RainMachine HD - hanging periodically

Comments

35 comments

  • Avatar
    RainMachine Nicholas

    Looks like a wireless issue. Could you go to Settings > System > Network Tools and take a picture of that screen?

  • Avatar
    Pitmancd

    I have the same problem.  My RainMachine Touch HD-12 was "rock solid" for a couple of years. Then, in the past year or so, it seems to lose connection to my WiFi... or at least I can't see it on my local network from my phone. If I unplug it, it will reconnect and work just fine, but then a few weeks will pass, and I'll have to unplug/replug it again. It has firmware v4.0.974. Can someone please help? BTW, is there a way to have it reboot automatically on a schedule? This would be a great workaround for the issue and a nice feature to have, so that I can have it reboot at a certain time every day, or once a week, etc.

  • Avatar
    Pitmancd

    If it's a wireless issue, is there a good/better USB WiFi adapter that I should purchase?  Is there a recommended list?

  • Avatar
    RainMachine Nicholas

    Before you unplug it, can you check whether the wrench is blinking and whether it has a WIFI signal? Does the display come back to life if you touch it?

    A list of WIFI adapters known to work: https://support.rainmachine.com/hc/en-us/community/posts/360009524834-HD12-16-Beta-version-4-0-944-list-of-known-working-WIFI-adapters

  • Avatar
    Ron I.

    Yeah that's my situation too, I basically never rebooted or touched the thing, then this started happening after the last upgrade. I'm not sure if it's something related to the upgrade or just a coincidence that it's been 2 years and the wifi adapters are starting to flake out.

    Incidentally, a periodic reboot is not a bad idea. If you have ssh access enabled, you could run a cron job or scheduled task from somewhere else on your network to reboot via ssh once a week. There's a minor bug here though, also related to the wifi: if you do "ssh rainmachine reboot" it will reboot but never close the connection. Your ssh session will just hang forever. So if you're going to script this, you'll need to arrange for it to time out.
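A minimal Python sketch of the scripted reboot with a timeout described above. It assumes key-based ssh login is already set up and that `rainmachine` resolves to the unit (both are assumptions, as is any particular time limit). `run_with_timeout` and `reboot_command` are illustrative names, not part of any RainMachine tooling. `subprocess.run` kills the child process when the timeout expires, which works around the ssh session that never closes.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it was killed after timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point.
        return None


def reboot_command(host):
    """Build the ssh invocation; the host alias and key-based login are assumptions."""
    return [
        "ssh",
        "-o", "BatchMode=yes",       # fail instead of prompting for a password
        "-o", "ConnectTimeout=10",   # give up quickly if the unit is unreachable
        host,
        "reboot",
    ]
```

Calling `run_with_timeout(reboot_command("rainmachine"), 30)` from a weekly cron job or scheduled task would give the periodic reboot. A `None` result is the expected outcome for the hanging session rather than an error.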

  • Avatar
    RainMachine Nicholas

    Does this WIFI issue occur at a well-defined interval, or is it just random? If you just go to the WIFI screen, does it automatically reconnect without rebooting?

  • Avatar
    ffuentes

    I have this issue as well. I sent in my unit for repair at my own expense because my system was out of warranty... RainMachine found no issues.... This issue is random, and when you guys tested it, you only tested for a few hours and determined that the system was "OK".

     

    I am glad I am not the only one with this problem and that RainMachine can look at this more closely!

  • Avatar
    RainMachine Nicholas

    We first need to separate the issues:

    1. Random WiFi disconnects

    2. Periodic complete freezing (not accessible over the network, display not responding to touch)

    Please detail the behavior as much as possible.

     

  • Avatar
    ffuentes

    In my case it is just as described by the OP. The system is partially accessible and responds to ping and partial requests.... But the application is unresponsive at the display. As stated by the OP, the wrench is blinking... When I touch the wrench and the display lights up, I see the wireless signal in "sleep mode", but it comes right back on when the display lights up....

    That's when, if I try to touch around, I get the unresponsive message....

  • Avatar
    Alex Stevenson (Edited )

    I am also getting the occasional hangups.  When it happens I can't remote into the unit via app.  When I try to access the controller locally I see the attached image.

     

  • Avatar
    David Browning

    I am seeing something very similar.  I notice the wrench blinking and only getting 1 out of 4 replies back to ICMP (ping).  Web interface starts to come up but won't, can't connect in the app either.  If I go and wake up the device by touching the wrench it starts to respond normally.  I even captured some traffic with tcpdump this last time, this tcpdump is coming directly off of my AP (Ubiquiti AP-AC Pro in case that is somehow relevant).

    10.11.0.18 = RainMachine
    davids-air.localdomain = My Macbook which is performing the ping (on a different subnet/vlan but traffic is allowed in case that somehow matters)
    10.11.0.1 = pfsense gateway

    21:15:04.129967 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 773, length 64
    21:15:05.133585 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 774, length 64
    21:15:06.136745 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 775, length 64
    21:15:06.262597 IP 10.11.0.1.67 > 10.11.0.18.68: BOOTP/DHCP, Reply, length 300
    21:15:07.141178 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 776, length 64
    21:15:08.145215 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 777, length 64
    21:15:09.148587 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 778, length 64
    21:15:10.153367 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 779, length 64
    21:15:10.326346 IP 10.11.0.1.67 > 10.11.0.18.68: BOOTP/DHCP, Reply, length 300
    21:15:11.156572 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 780, length 64
    21:15:11.160998 ARP, Reply 10.11.0.1 is-at 00:90:0b:7a:8a:a6 (oui Unknown), length 42
    21:15:11.163911 IP 10.11.0.18 > davids-air.localdomain: ICMP echo reply, id 55751, seq 780, length 64
    21:15:12.161390 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 781, length 64
    21:15:13.164706 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 782, length 64
    21:15:14.166330 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 783, length 64
    21:15:14.497179 IP 10.11.0.1.67 > 10.11.0.18.68: BOOTP/DHCP, Reply, length 300
    21:15:15.170071 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 784, length 64
    21:15:15.174569 ARP, Reply 10.11.0.1 is-at 00:90:0b:7a:8a:a6 (oui Unknown), length 42
    21:15:15.176672 IP 10.11.0.18 > davids-air.localdomain: ICMP echo reply, id 55751, seq 784, length 64
    21:15:16.174641 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 785, length 64
    21:15:17.180026 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 786, length 64
    21:15:18.183240 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 787, length 64
    21:15:18.716311 IP 10.11.0.1.67 > 10.11.0.18.68: BOOTP/DHCP, Reply, length 300
    21:15:19.186816 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 788, length 64
    21:15:19.191838 ARP, Reply 10.11.0.1 is-at 00:90:0b:7a:8a:a6 (oui Unknown), length 42
    21:15:19.195414 IP 10.11.0.18 > davids-air.localdomain: ICMP echo reply, id 55751, seq 788, length 64
    21:15:20.188749 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 789, length 64
    21:15:21.192215 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 790, length 64
    21:15:22.195029 IP davids-air.localdomain > 10.11.0.18: ICMP echo request, id 55751, seq 791, length 64
    21:15:22.973060 IP 10.11.0.1.67 > 10.11.0.18.68: BOOTP/DHCP, Reply, length 300

    Once it is working again, I only see the ICMP requests and replies, and everything else is working too (web interface, app).  I also stop seeing the DHCP replies from my gateway. I never saw the requests here because I was running tcpdump host 10.11.0.18.  It seems like the unit is continually sending out DHCP requests when it gets into this state.
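To put numbers on a capture like the one above, a rough Python sketch can tally echo requests, echo replies, and DHCP replies from tcpdump's text output. The function name `summarize` is made up for illustration, and the regular expression assumes the tcpdump line format shown in this post.

```python
import re

# Matches e.g. "ICMP echo request, id 55751, seq 780" and captures kind and sequence number.
ICMP_SEQ = re.compile(r"ICMP echo (request|reply), id \d+, seq (\d+)")


def summarize(lines):
    """Count pings, answers, and DHCP replies in tcpdump text output."""
    requested, replied = set(), set()
    dhcp_replies = 0
    for line in lines:
        match = ICMP_SEQ.search(line)
        if match:
            target = requested if match.group(1) == "request" else replied
            target.add(int(match.group(2)))
        elif "BOOTP/DHCP, Reply" in line:
            dhcp_replies += 1
    return {
        "requests": len(requested),
        "replies": len(replied),
        "unanswered": sorted(requested - replied),  # sequence numbers with no reply
        "dhcp_replies": dhcp_replies,
    }
```

Feeding it the lines of a saved capture (for example `summarize(open("capture.txt"))`, where the file name is hypothetical) gives a quick view of how many pings went unanswered while the DHCP replies were arriving.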

    This matches up with some of the logs on the RainMachine as well although tcpdump is showing it much more frequently.

    2019-03-14 21:02:49,889 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:03:04,037 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')
    2019-03-14 21:05:04,347 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:05:18,397 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')
    2019-03-14 21:07:18,681 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:07:30,719 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')
    2019-03-14 21:13:31,757 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:13:33,762 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')
    2019-03-14 21:15:34,075 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:15:36,083 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')
    2019-03-14 21:17:36,318 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: '10.11.0.18' new ip: None)
    2019-03-14 21:18:32,569 - INFO  - rmThreadWatcher:301 - Refreshed WIFI Information. (old: None new ip: '10.11.0.18')

    Firmware version
    4.0.974
    Hardware revision
    3

    I honestly had been ignoring the issue until tonight so I am not sure how frequently this is occurring.  If there's something additional you'd like me to gather if I see it again please let me know.
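If it helps with gathering more data, the rmThreadWatcher lines above can be turned into outage durations with a short Python sketch. It pairs each "new ip: None" entry with the next entry that reports an address. The name `outage_seconds` is hypothetical, and the pattern assumes the exact log format pasted in this post.

```python
import re
from datetime import datetime

# Captures the timestamp and the value after "new ip:" (None or a quoted address).
LOG_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*"
    r"Refreshed WIFI Information\. \(old: .* new ip: (None|'[\d.]+)"
)


def outage_seconds(lines):
    """Return how long (in seconds) the unit was without an IP for each loss/regain pair."""
    outages = []
    lost_at = None
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if match.group(2) == "None":
            lost_at = stamp  # address was dropped
        elif lost_at is not None:
            outages.append((stamp - lost_at).total_seconds())
            lost_at = None  # address came back
    return outages
```

The intervals between outages could be derived the same way, which might show whether the drops line up with a DHCP lease time.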

  • Avatar
    RainMachine Nicholas

    These intermittent WIFI errors seem to affect some customers that have:

    1. Certain models of repeaters (only TP Link reported for now)

    2. Certain MESH WIFI solutions (Eero and Netgear. Netgear issues solved after a router firmware update)

    3. Certain WIFI routers with dual radios 2.4 and 5GHz on same network name (SSID)

     

    Some of the above WIFI issues were solved by using the Display setting: Local Unit UI > Settings > System > Display > Keep Display on but dim it after ...

    This setting makes the Android system query the state of WIFI much more often.

    For David's issue, I wonder whether a static IP or a longer DHCP lease would help.

     

    Since these issues seem to affect a small percentage of customers, we can only collect the reports internally and, if possible, add the reported hardware to our WIFI testing network.

  • Avatar
    Alex Stevenson

    Thanks, Nicholas.  I do use the Google WiFi mesh system along with having the same SSID for both 2.4 and 5 GHz networks.

     

    I will try the workaround you mentioned with keeping the display dimmed.

     

    Thanks!

  • Avatar
    ffuentes

    Nicholas,

     

    I use openmesh. I am going to try out the workaround and report back.

     

    Regards,

    -

  • Avatar
    Ron I.

    Nicholas, I don't use any of the brands mentioned above, although I do have Ubiquiti APs like David Browning. I got a video when I had the issue earlier today. Note that it resolves itself after I turn on the screen and wait a few seconds:

    https://streamable.com/r7psn

    This has all of the hallmarks of a power management issue. Did the latest firmware update include a new Android kernel? Power management stuff can change significantly between kernel versions. I'm guessing the wifi adapter is getting put into sleep mode under certain conditions, which is why it helps to always keep the display on.

  • Avatar
    RainMachine Nicholas

    There were no kernel updates, and our Android build has WIFI_SLEEP_POLICY_NEVER by default. The problem is that we cannot replicate this behavior. Maybe it's related to a reauthentication interval or protocol on newer routers, although we have an up-to-date wpa_supplicant stack.

  • Avatar
    Neale B

    I have the same issue and I also have a Ubiquiti AP.  In my case, the device will show offline in the application, and when I go to the panel and touch the wrench icon, a message will pop up saying that the application is hung and asking me if I want to restart it. Once I restart the application, it seems to start working again. This usually happens at 2- or 3-week intervals.

  • Avatar
    Pitmancd (Edited )

    Mine locks up about every week or so.  Won't respond, have to unplug it to reset it.  Typical Windows OS issues.  I don't understand why the developers can't add an option for it to automatically reboot periodically, whether it's every day, or once a week, etc.  There's apparently a program issue, memory leak, etc.  So frustrating especially when my phone keeps getting dinged with RainMachine messages and I'm not at home to reset it.  Here's what the screen looks like when it locks up:

     

  • Avatar
    Pitmancd

    BTW, I have had the same ASUS RT-AC68P WAP since I bought the RainMachine.  It's rock solid with no issues with any other devices.  It seems that the RainMachine issues started occurring after a RainMachine firmware update.  If there is some debugging I can do, gathering logs, etc., please let me know.

  • Avatar
    ffuentes

    I agree, this was all working fine until a firmware update.

  • Avatar
    RainMachine Nicholas (Edited )

    We are investigating. Worth mentioning that RainMachine does have a hardware watchdog, which will reboot the machine if computation or irrigation stops working. This lockup seems to be WIFI related.

  • Avatar
    Aechelon

    Seeing the same behavior here.  Not using a mesh network.  Just a regular Netgear R7000 router. No WIFI repeaters.  HD-12 had been rock solid for years (literally since May 2016).  After the 4.0.974 update it disconnects from the WIFI every couple of weeks and will not reconnect.  Can't ssh or ping it.

    Rebooting gets it going for another couple of weeks.

    @Nicholas, is there a procedure to downgrade to the previous firmware while you investigate?

  • Avatar
    RainMachine Nicholas

    There isn't an easy procedure to go back to the previous version. It seems quite strange for the problem to happen at an interval of "weeks".

    We will have a new beta soon that we will need feedback on and I'll post here once it's available.

  • Avatar
    ffuentes

    I haven't had a system hang since I set my display not to dim.

  • Avatar
    Alex Stevenson

    I also have not had a system hang/crash since setting the display to stay on/dimmed.

  • Avatar
    Pitmancd

    I, too, have not had it hang since the display has been dimmed.  But... I don't want to leave it this way and ruin the display!

  • Avatar
    RainMachine Nicholas

    Thanks for the feedback!

     

  • Avatar
    David Browning

    So I paid closer attention, and my issue appears to be exactly the same as everyone else's: the wrench flashes, I press a key to wake it up, see the application crash, and then it starts to work again.  Since I sniffed traffic previously and did see the issue with DHCP, I went ahead and assigned a static IP as suggested by Nicholas and have had zero issues since.  Would someone who has been successful with the always-on display method be willing to try setting a static IP and see if that also works for them (obviously with the display workaround disabled)?  I figure the more data points we can provide, the more likely it is that a root cause can be found.

  • Avatar
    Aechelon

    I set a static IP last Thursday and haven't seen the issue since, but it'll take a few more days to confirm that fixes it.

  • Avatar
    ffuentes

    Mine has been static and the issue remains. It stopped only when I set the screen not to sleep.
