ESP8266: Extremely bad clock accuracy · Issue #2724 · micropython/micropython · GitHub

ESP8266: Extremely bad clock accuracy #2724

Adam5Wu opened this issue Dec 26, 2016 · 18 comments

@Adam5Wu
Adam5Wu commented Dec 26, 2016

Currently most time functions heavily depend on the built-in RTC in ESP8266.
However, the current implementation of RTC is sub-optimal (more on this later).

As a result, the clock accuracy is, basically, horrible.
I found this problem while trying to implement a clock using ESP8266 and MicroPython.
The clock drifts 1 second every 30~80 seconds!

I wrote a small test script to illustrate the problem:

import time
import ntptime

def run():
	# Initialize time with ntp
	ts_ntp = ntptime.time()
	ts_time = time.time()

	ts_diff = ts_ntp - ts_time
	print("Initial time offset = %d"%(ts_diff))
	cnt = 0
	while True:
		# Wait 10 sec
		time.sleep_ms(10000)
		cnt+= 1
		try:
			ts_ntp = ntptime.time() - ts_diff
			ts_time = time.time()
			print("%.3d: Time drift = %d"%(cnt, ts_ntp - ts_time))
		except Exception as e:
			print("Error query NTP time: %s"%e)

It basically compares the system time with NTP every 10 seconds and outputs the time drift.

Ideally, the time drift should be zero.
On my ESP8266, the output looks like:

001: Time drift = 0
...
007: Time drift = 2
...
013: Time drift = 3
...
018: Time drift = 4
...
022: Time drift = 5
...
028: Time drift = 6
...
032: Time drift = 7
033: Time drift = 8
...
038: Time drift = 9
...
060: Time drift = 14
...
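
To put the log above in relative terms, here is a small back-of-the-envelope calculation. It assumes each loop iteration takes about 10 s of real time (the NTP round trip is ignored), and `drift_percent` is only an illustrative helper, not part of any library.

```python
def drift_percent(drift_s, elapsed_s):
    """Relative clock error in percent."""
    return 100.0 * drift_s / elapsed_s

# Sample 060 from the log: 14 s of drift after roughly 60 * 10 s.
print("%.2f%%" % drift_percent(14, 60 * 10))

# "1 second every 30~80 seconds" expressed as a percentage range.
print("%.2f%% to %.2f%%" % (drift_percent(1, 80), drift_percent(1, 30)))
```

Under these assumptions the logged run works out to a bit over 2%, which sits inside the 1.25% to 3.33% range implied by "1 second every 30~80 seconds".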

The source of inaccuracy is the use of system_rtc_clock_cali_proc().
ESP8266's RTC frequency is subject to environmental factors, such as temperature, voltage, etc., but the current code just obtains a single value of system_rtc_clock_cali_proc() and stores it for use indefinitely.

While a more proper RTC implementation could probably solve the problem, there is a much simpler alternative -- the system clock. ESP8266 provides a system clock that is independent of the RTC. According to "Timekeeping on ESP8266 & arduino uno WITHOUT an RTC", the accuracy of ESP8266's system clock is pretty good (1 sec / day).

I am working on a patch that enables access to ESP8266's system clock, to see if it helps!

@Adam5Wu
Author
Adam5Wu commented Dec 26, 2016

Test code for time drift of rtc:

import time
import utime
import machine
import ntptime

def run():
	# Initialize rtc time with ntp
	ts_ntp = ntptime.time()
	ts_tm = utime.localtime(ts_ntp)
	ts_day = ts_ntp - ts_tm[3]*3600 - ts_tm[4]*60 - ts_tm[5]
	ts_ntp-= ts_day

	rtc = machine.RTC()
	now = rtc.datetime()
	ts_rtc = now[4]*3600*1000 + now[5]*60*1000 + now[6]*1000 + now[7]

	ts_diff = ts_ntp*1000 - ts_rtc
	print("Initial time offset = %d"%(ts_diff))
	cnt = 0
	while True:
		# Wait 10 sec
		time.sleep_ms(10000)
		cnt+= 1
		try:
			ts_ntp = ntptime.time() - ts_day
			now = rtc.datetime()
			ts_rtc = now[4]*3600*1000 + now[5]*60*1000 + now[6]*1000 + now[7]
			print("%.3d: Time drift = %d"%(cnt, ts_ntp*1000 - ts_rtc - ts_diff))
		except Exception as e:
			print("Failed to query NTP time: %s"%e)

It compares the RTC time with NTP every 10 seconds and outputs the time drift in milliseconds.

@Adam5Wu
Author
Adam5Wu commented Dec 26, 2016

Success. :) PR #2726

  1. I have implemented access functions for the non-RTC system clock, and modified the general time function (time.time) to use this clock.

    Running 20 minutes of the time tracking code shows no observable clock drift.

  2. I have also renovated the RTC part of clock code. Now, each RTC clock reading uses the latest system_rtc_clock_cali_proc() for calculating time passed since the last RTC clock reading. While this is still an approximation, it aligns with the recommended usage of other utime functions -- calling it more frequently results in better accuracy.

    Running 20 minutes of the time tracking code shows clock drifts are confined within 1000ms.
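
The difference between the two RTC strategies can be illustrated with a toy simulation in plain Python. This is not the code from the PR: the tick period values, the `simulate` helper, and the linear change of the period are all made-up assumptions, chosen only to show why a calibration value stored once accumulates error while re-reading it at every clock read keeps the error small.

```python
def simulate(periods_us, ticks_per_read):
    """Return (true_us, fixed_cal_us, per_read_cal_us) after all reads.

    periods_us[i] is the hypothetical real RTC tick period (in us)
    during read interval i.
    """
    true_us = fixed_us = per_read_us = 0.0
    boot_cal = periods_us[0]            # calibration captured once at boot
    last = len(periods_us) - 1
    for i, period in enumerate(periods_us):
        true_us += ticks_per_read * period
        # Stored-once strategy: always use the boot-time calibration.
        fixed_us += ticks_per_read * boot_cal
        # Per-read strategy: use the calibration measured when the clock
        # is read, i.e. the period at the end of the interval.
        cal_at_read = periods_us[min(i + 1, last)]
        per_read_us += ticks_per_read * cal_at_read
    return true_us, fixed_us, per_read_us

# Hypothetical period drifting slowly upwards over 60 read intervals.
periods = [6.0 + 0.002 * i for i in range(60)]
true_us, fixed_us, per_read_us = simulate(periods, 1600000)
print("fixed calibration error:    %.2f s" % ((fixed_us - true_us) / 1e6))
print("per-read calibration error: %.2f s" % ((per_read_us - true_us) / 1e6))
```

In this toy model the per-read error only reflects the change in period within a single interval, which is why reading the clock more often helps.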

@nubcore
nubcore commented Jan 11, 2017

Any idea how to address this in deep sleep? I presume the RTC must be used there, as basically everything else is turned off. And thanks for the work on this, I had just started scratching my head over this one.

@Adam5Wu
Author
Adam5Wu commented Jan 12, 2017

Deepsleep is handled; both the existing and proposed implementations do it in a similar fashion -- by projection.

Before deepsleep, the counter value of projected wake time is computed and stored in RTC memory. When wakeup happens, RTC counter is reset to zero, and projected counter value is loaded back from RTC memory as base. Ignoring the (not too small) errors induced by projection, the clock is mostly on track with real world time.

Caveat 1 - as found earlier in this issue, using a single rtc_cali value across a long time can incur 1 second of error every 30-80 seconds. So if you deepsleep 600 seconds, expect as much as 10-20 seconds of error.

Caveat 2 - wake up time projection works if the wake up happens at the scheduled time. If a manual reset is triggered before the scheduled wakeup, the projection will be wrong and the clock will run ahead of the real world clock.
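
Both caveats can be sketched with a few lines of arithmetic. The helper names are hypothetical, and the sketch assumes the error rate measured earlier in this issue (1 second per 30 to 80 seconds) also applies during deep sleep.

```python
def projection_error(sleep_s, secs_per_1s_error):
    """Caveat 1: expected clock error (s) after sleeping sleep_s seconds."""
    return sleep_s / secs_per_1s_error

def clock_ahead_after_early_reset(scheduled_sleep_s, actual_sleep_s):
    """Caveat 2: the projected wake time is stored before sleeping, so an
    early reset leaves the clock ahead by the part of the sleep that was
    skipped."""
    return scheduled_sleep_s - actual_sleep_s

sleep_s = 600
print("calibration error: %.1f to %.1f s" % (
    projection_error(sleep_s, 80), projection_error(sleep_s, 30)))
print("ahead after reset at 100 s: %d s" % (
    clock_ahead_after_early_reset(sleep_s, 100)))
```

For a 600 second sleep this gives a range of the same order as the 10-20 seconds quoted in Caveat 1.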

@Benhgift
Benhgift commented Feb 13, 2017

@Adam5Wu Thank you for doing this. I'm also building a clock and really appreciate that you worked on fixing this. And thanks to @nubcore and @dpgeorge for reviewing.

Looking forward to having working time.time()! Currently I pull the time from https://timezonedb.com

@PvdBerg1998

Hi guys, is this issue fixed? I'm running the latest version and I'm still seeing the same time shift. I'm using the RTC and synchronising using NTP. Every minute it shifts by ~1 second.

@PvdBerg1998

Ok I see the PR was not merged. The issue appears to be that you're reading the calibration value once and then using it forever, if I'm understanding it correctly

@danielmader

Hello, I came across the same issue (trying to make a NeoPixel clock using esp8266-20200911-v1.13.bin). Over a night, I'm off by more than 10 minutes. Is there any way to get better accuracy from the system clock so that syncing the time once a day is enough? Many thanks in advance!

@robert-hh
Contributor
robert-hh commented Nov 7, 2020

A quick test showed that on my ESP8266 time.time() is extremely bad, like 5% off the expected tick. But time.ticks_ms() is very precise, being off by only 7E-6, or 0.6 seconds/day.
The only complication would be setting ticks_ms() to the proper time, but that could be done using the time from ntptime.time() as offset.
Edit: Setting up a timer with a 1 second period is also quite precise and makes counting time easier. You can just take the initial time in seconds delivered by ntptime and count up from there.
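
A minimal sketch of the timer-based idea, written so the counting logic can be exercised off-device. `SecondsCounter` is a hypothetical name; on the ESP8266 its `tick` method would be passed as the callback of a `machine.Timer` with a 1000 ms period, and the starting value would come from `ntptime.time()`.

```python
class SecondsCounter:
    """Counts seconds from a starting timestamp, one per timer tick."""

    def __init__(self, start_s):
        self.seconds = start_s

    def tick(self, timer=None):
        # Called once per second by the periodic timer.
        self.seconds += 1

# Off-device demonstration with a made-up start value.
counter = SecondsCounter(1000)
for _ in range(5):
    counter.tick()
print(counter.seconds)  # → 1005

# On the device (not run here), the wiring would look roughly like:
#   import machine, ntptime
#   counter = SecondsCounter(ntptime.time())
#   tim = machine.Timer(-1)
#   tim.init(period=1000, mode=machine.Timer.PERIODIC, callback=counter.tick)
```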

@PvdBerg1998

I believe the PR fixes the issue, but you need to compile the project yourself

@danielmader

Thank you for your comments! @robert-hh: I've tried utime.ticks_ms() but I don't see a real improvement. Using a timer with 1000ms period is about the same quality as utime.time(), so it won't get me through a night without NTP sync (10-15mins off).
It seems I need to try a build with the fix, but I'm new to this, so I'll need to do some research.
I'd really appreciate the fix in the official build...

@robert-hh
Contributor
robert-hh commented Nov 8, 2020

That's strange. ticks_ms() and the timer irq are derived from the main crystal, which is usually precise, whereas time.time() uses the RTC as source, which runs from an RC oscillator. I used the little script below for testing. It creates a 0.5 Hz square wave, and that one was spot on, with about 1E-5 error.
Tested with both a counter and an oscilloscope. The period was seen as 2.000015s on average.

from machine import Pin, Timer

c = 0
p = Pin(5, Pin.OUT)

def ptc(t):
    global c, p
    c += 1
    p(c&1)

tim = Timer(-1)
tim.init(mode = Timer.PERIODIC, period=1000, callback=ptc)
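
For reference, the measured 2.000015 s period converts to a relative error and a per-day offset as follows. This is plain arithmetic on the numbers quoted above; the helper name is illustrative only.

```python
def relative_error(measured_s, nominal_s):
    """Fractional deviation of a measured period from its nominal value."""
    return (measured_s - nominal_s) / nominal_s

err = relative_error(2.000015, 2.0)
print("relative error: %.1e" % err)
print("offset per day: %.2f s" % (err * 86400))
```

That is about 7.5E-6, or roughly 0.65 seconds per day, in line with the ticks_ms() figure reported earlier in the thread.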

@danielmader
danielmader commented Nov 9, 2020

Thanks a lot, @robert-hh! Your posts helped me to track down the issue further: I can confirm that both ticks_ms() and the timers are pretty exact! (My fault was to not set up offsets correctly, which led me down a wrong path -_-.) In about 6 hours I could not observe any visible difference.
However, in the course of the night, something odd happened. The interrupt system and ticks_ms() seemed to slow down, i.e. a second took visibly longer, while time.time() was still doing better (but was also very wrong):

actual time: 07:18:21
time.time(): 06:06:17
timer(1000): 03:32:18
ticks_ms(): 03:32:18

I've now changed my code to not print out to stdout but only to the OLED. Maybe the prints caused the µC to choke?

## system modules
import machine  # needed for machine.Timer below
import utime as time
import ntptime

## custom modules
from init_oled import oled
from init_wlan import wlan
# print('>> display:', oled)
# print('>> wifi:' , wlan, wlan.ifconfig())

##*****************************************************************************
##*****************************************************************************

##=============================================================================
def sync_time_NTP():
    '''
    Update RTC (NTP sync).
    '''
    for _ in range(99):
        try:
            ntptime.settime()
            ## TODO: exact offset for time.ticks_ms()
            ts_offset = time.time()
            print('\n>> NTP timestamp:', ts_offset)
            return ts_offset
        except OSError:
            time.sleep(0.5)

##=============================================================================
def get_time_NTP():
    '''
    Get current time via NTP.
    '''
    for _ in range(99):
        try:
            ts_now = ntptime.time()
            print('\n>> NTP timestamp:', ts_now)
            return ts_now
        except OSError:
            time.sleep(0.5)

##-----------------------------------------------------------------------------
ts_clocktick = 0  # noqa:E305
def clocktick(timer):
    '''
    Timer function to add one second.
    '''
    global ts_clocktick
    ts_clocktick += 1

    ## print timestamp ==> memory leak?
    # print(ts_clocktick)

##-----------------------------------------------------------------------------
def update_oled(timer):
    '''
    Timer function to update OLED screen.
    '''
    def _toString(timestamp):
        localtime = time.localtime(timestamp)
        year, month, mday, hour, minute, second, weekday, yearday = localtime
        timestr = "{:02d}:{:02d}:{:02d}".format(hour, minute, second)
        return timestr

    ## RT system clock
    ts_now1 = time.time()
    timestr1 = _toString(ts_now1)

    ## timed counter
    ts_now2 = ts_clocktick + ts_offset_ntp
    timestr2 = _toString(ts_now2)

    ts_now3 = int(round((time.ticks_ms() - ts_offset_ticksms) / 1000)) + ts_offset_ntp
    timestr3 = _toString(ts_now3)

    oled.fill(0)
    oled.textc(timestr1, 64, 10)
    oled.textc(timestr2, 64, 20)
    oled.textc(timestr3, 64, 30)
    oled.show()

    ## print timestamps as 'hh:mm:ss' ==> memory leak?
    # print()
    # print('utime.time()    ', timestr1)
    # print('clocktick()     ', timestr2)
    # print('utime.ticks_ms()', timestr3)


##*****************************************************************************
##*****************************************************************************

## init clock counters
ts_offset_ntp = sync_time_NTP()
ts_offset_ticksms = time.ticks_ms()
#ts_now = get_time_NTP()

## init timers
## https://docs.micropython.org/en/latest/library/pyb.Timer.html
## https://docs.micropython.org/en/latest/esp8266/quickref.html#timers
tim = machine.Timer(-1)
tim.init(period=1000, mode=machine.Timer.PERIODIC, callback=clocktick)
tim2 = machine.Timer(-1)
tim2.init(period=4500, mode=machine.Timer.PERIODIC, callback=update_oled)

## initial display
update_oled(None)

@robert-hh
Contributor

I have a simple time counting script based on the timer irq running since yesterday 20:30. Compared to the PC clock, it is still exactly on time. time.time() is now off by about 3400 seconds.
I would not use ticks_ms() for that purpose, because you have to handle the overrun of the ticks counter.
Maybe the two timers caused the trouble.
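
The overrun handling mentioned above could look roughly like this. The sketch is plain Python so it runs anywhere: `TICKS_PERIOD` is assumed to be 2**30 (an assumption about the port, not something stated in this thread), and `ticks_diff` here is a local re-implementation that mimics the wrap-around behaviour of MicroPython's `time.ticks_diff()`. `TickClock` is a hypothetical helper, and it has to be updated more often than half a ticks period for the difference to stay valid.

```python
TICKS_PERIOD = 1 << 30  # assumed wrap-around period of ticks_ms()

def ticks_diff(new, old, period=TICKS_PERIOD):
    """Signed difference new - old that stays correct across one wrap."""
    half = period // 2
    return ((new - old + half) % period) - half

class TickClock:
    """Accumulates elapsed milliseconds on top of a starting epoch."""

    def __init__(self, epoch_s, ticks_now):
        self._ms = epoch_s * 1000
        self._last = ticks_now

    def update(self, ticks_now):
        # Add only the time since the previous update, then remember it.
        self._ms += ticks_diff(ticks_now, self._last)
        self._last = ticks_now
        return self._ms // 1000

# Simulated readings around the wrap: 500 ms before it, then 1500 ms after.
clock = TickClock(epoch_s=100, ticks_now=TICKS_PERIOD - 500)
print(clock.update(1500))  # → 102
```

On the device, `update` would be fed `time.ticks_ms()` from a periodic callback or the main loop.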

@danielmader

@robert-hh: My script with the two timers has now been running perfectly in sync for about 24hrs (time.time() is off by +180s)! So maybe there is some kind of memory leak with prints to stdout? Or maybe such prints should be avoided altogether, since time.time() is also off by only about 3 minutes, as opposed to hours in the past. I'll continue testing by re-adding the print statements.

@robert-hh
Contributor

My test of the timer irq was still spot on after 2 hours, while time.time() showed a difference of ~4400 seconds. Print should not make a difference. Even if time.time() was only wrong by 180 s, that's still too much for a digital clock. That would be good enough only for a low quality mechanical watch.

antoniotejada added a commit to antoniotejada/lessmostat that referenced this issue Oct 16, 2021
…dicator

When looking at utilization, it would jump around a lot across on/off
boundaries, since while on, the browser is used as utilization reference clock
and while off, the esp8266 is the one used. It's normal to have some drift
between different machines synced via NTP but the drift was too much and
increasing (in the order of 700 seconds after some time running).

The underlying problem is that the esp8266 RTC drifts badly, see
micropython/micropython#2724

To work around that, this change does two things:
- In esp8266, NTP re-sync every so many seconds
- In the browser, use the timestamp from the most recent esp8266 MQTT message to
  calculate the utilization (theoretically this gives less granularity in the
  browser, but because the utilization display is only updated when an MQTT
  message from esp8266 comes, the displayed granularity actually stays the
  same).

Files:

lessmostat.html
- Changed fan state to display on the "cooling" prompt as "blowing" when only
  the fan is on
- Added logging of the drift between the browser clock and the message arrived
  timestamp (so this number is not just drift, since it includes the delay in
  sending and receiving the message).
- Added "blowing" indicator when the fan is on but ac is off

lessmostat.py
- Added reported internal drift (ie esp8266 drift vs. NTP), this includes some
  delay for the NTP message and calculation.
- Added sync with NTP every 240 seconds. This keeps the reported drift below 7s
  and below 2s most of the time

Tests:
- Ran and observed reported drift in the browser and in webrepl.
- Verified webrepl reported absolute drift is less than 7 (-7) for a 240 second
  NTP sync time (and normally less, around -2)
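
The periodic re-sync workaround described in this commit message can be sketched as a small scheduler. This is not the code from lessmostat: `NtpResync` and its injected `now` and `sync` callables are hypothetical, used so the logic runs off-device. On an ESP8266, `sync` would typically wrap `ntptime.settime()` and `now` would be a seconds source such as a timer-driven counter.

```python
class NtpResync:
    """Calls sync() whenever interval_s seconds have passed since the
    last successful sync."""

    def __init__(self, interval_s, now, sync):
        self.interval_s = interval_s
        self._now = now
        self._sync = sync
        self._last_sync = None

    def poll(self):
        """Call regularly; returns True if a sync was performed."""
        t = self._now()
        if self._last_sync is None or t - self._last_sync >= self.interval_s:
            try:
                self._sync()
            except OSError:
                return False  # network error: retry on the next poll
            self._last_sync = t
            return True
        return False

# Off-device demonstration with a fake clock polled every 10 s.
fake_time = [0]
calls = []
resync = NtpResync(240, now=lambda: fake_time[0],
                   sync=lambda: calls.append(fake_time[0]))
for t in range(0, 1000, 10):
    fake_time[0] = t
    resync.poll()
print(calls)  # → [0, 240, 480, 720, 960]
```

The 240 second interval is the value quoted in the commit message; a shorter interval trades network traffic for a smaller worst-case drift.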

@gmazilla
gmazilla commented Dec 8, 2022

problem is still there
in my case time.time is slower than ntptime.time by 90sec per hour, or -2.5% error
micropython version 1.19.1, esp8266 nodemcu lolin v3 (from aliexpress)

@jonnor
Contributor
jonnor commented Sep 8, 2024

There was a PR for this issue that people say fixed it. So picking that up again and getting it into mainline will probably be needed to fix this.
