setBigTimeout (evanhahn.com)
177 points by cfj 18 hours ago | 99 comments





The default behaviour of setTimeout seems problematic. Could be used for an exploit, because code like this might not work as expected:

    const attackerControlled = ...;
    if (attackerControlled < 60_000) {
      throw new Error("Must wait at least 1min!");
    }

    setTimeout(() => {
      console.log("Surely at least 1min has passed!");
    }, attackerControlled);

The attacker could set the value to a comically large number and the callback would execute immediately. This also seems to be true for NaN. The better solution (imo) would be to throw an error, but I assume we can't due to backwards compatibility.
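To make that concrete, here's a minimal guard (my sketch, not from the article; the names are made up) that validates both bounds and rejects non-finite values before anything reaches setTimeout. The upper bound matches the 32-bit signed limit most runtimes clamp to:

```javascript
// ~24.8 days in milliseconds: the 32-bit signed integer limit.
const MAX_SAFE_TIMEOUT = 2 ** 31 - 1;

function checkedDelay(value) {
  const delay = Number(value);
  // Rejects NaN, Infinity, too-small AND too-large values in one check.
  if (!Number.isFinite(delay) || delay < 60_000 || delay > MAX_SAFE_TIMEOUT) {
    throw new Error("Delay must be between 1min and ~24.8 days");
  }
  return delay;
}

// setTimeout(callback, checkedDelay(attackerControlled));
```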

A scenario where an attacker controls a timeout, where the callback running sooner than one minute later would be a security failure, but having it run days later is perfectly fine so no upper-bound check is required, seems… quite a contrived edge case.

The problem here is having an attacker control a security sensitive timer in the first place.


The exploit could be a DoS attack. I don't think it's that contrived to have a service that runs an expensive operation at a fixed rate, controlled by the user, limited to 1 operation per minute.

A minimum timing of an individual task is not a useful rate limit. I could schedule a bunch of tasks to happen far into the future but all at once for example.

Rate limits are implemented with e.g., token buckets which fill to a limit at a fixed rate. Timed tasks would then on run try to take a token, and if none is present wait for one. This would then be dutifully enforced regardless of the current state of scheduled tasks.

Only consideration for the timer itself would be to always add random jitter to avoid having peak loads coalesce.
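A minimal sketch of the token-bucket idea (all names illustrative; the clock is injected so the logic is deterministic and testable):

```javascript
// Tokens refill continuously at ratePerMs up to capacity; each
// operation takes one token or is refused.
class TokenBucket {
  constructor(capacity, ratePerMs, now = Date.now) {
    this.capacity = capacity;
    this.ratePerMs = ratePerMs;
    this.tokens = capacity;
    this.now = now;
    this.last = now();
  }

  tryTake() {
    const t = this.now();
    // Refill based on elapsed time, capped at capacity.
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (t - this.last) * this.ratePerMs
    );
    this.last = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // token granted
    }
    return false; // caller should wait and retry
  }
}
```

Because the limit is enforced at take-time rather than schedule-time, a pile of tasks all scheduled for the same future instant still can't exceed the rate.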


I don't think it's that far fetched that a developer implements a rate limiter with setTimeout, where a task can only be executed if a timeout is not already running. The behaviour in the article is definitely a footgun in this scenario.

> I don't think it's that contrived to have a service that runs an expensive operation at a fixed rate, controlled by the user

Maybe not contrived but definitely insecure by definition. Allowing user control of rates is definitely useful & a power devs will need to grant but it should never be direct control.


Can you elaborate on what indirect control would look like in your opinion?

No matter how many layers of abstraction you put in between, you're still eventually going to be passing a value to the setTimeout function that was computed based on something the user inputted, right?

If you're not aware of these caveats about extremely high timeout values, how do any layers of abstraction in between help you prevent this? As far as I can see, the only prevention is knowing about the caveats and specifically adding validation for them.


> that was computed

Or comes from a set of known values. This stuff isn't that difficult.

This doesn't require prescient knowledge of high timeout edge cases. It's generally accepted good security practice to limit business logic execution based on user input parameters. This goes beyond input validation & bounds on user input (both also good practice but most likely to just involve a !NaN check here), but more broadly user input is data & timeout values are code. Data should be treated differently by your app than code.

To generalise the case more, another common case of a user submitting a config value that would be used in logic would be string labels for categories. You could validate against a known list of categories (good but potentially expensive) but whether you do or not it's still good hygiene to key the user submitted string against a category hashmap or enum - this cleanly avoids using user input directly in your executing business logic.
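The keying idea could look something like this (category names and limits are made-up examples):

```javascript
// User input selects from known values; it never becomes the value itself.
const RATE_LIMITS_MS = {
  free: 3_600_000,    // one op per hour
  base: 600_000,      // one op per ten minutes
  important: 60_000,  // one op per minute
};

function rateLimitFor(userCategory) {
  // Object.hasOwn avoids matching prototype properties like "toString".
  if (!Object.hasOwn(RATE_LIMITS_MS, userCategory)) {
    throw new Error("Unknown category");
  }
  return RATE_LIMITS_MS[userCategory];
}
```

Whatever string the user submits, the only values that can ever reach setTimeout are the three you wrote yourself.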


>Can you elaborate on what indirect control would look like in your opinion?

although not the OP, this is what I would mean by indirect control. In pseudocode:

    if userAccountType === "free" then rate = longRate
    if userAccountType === "base" then rate = infrequentRate
    if userAccountType === "important" then rate = frequentRate

obviously rate determination would probably be more complicated than just userAccountType


That's just terrible input validation and has nothing to do with setTimeout.

If your code would misbehave outside a certain range of values and your input might span a larger range, you should be checking your input against the range that's valid. Your sample code simply doesn't do that, and that's why there's a bug.

That the bug happens to involve a timer is irrelevant.


You are a bad programmer if you think silently doing the wrong thing is not a bug. The right thing to do with unexpected input as the setTimeout library author is to raise an exception.

It's in the standard library. You're a bad programmer if you don't learn the ins and outs of the standard library, or make sweeping generalizations.

The standard library is an API just like any other library. The only thing different about it is backward compatibility (which in JS is paramount, and is the reason setTimeout can't be fixed directly). It is still a bad design.

> That's just terrible input validation and has nothing to do with setTimeout.

Except for the fact that this behaviour is surprising.

> you should be checking your input against the range that's valid. Your sample code simply doesn't do that, and that's why there's a bug.

Indeed, so why doesn't setTimeout internally do that?


> Indeed, so why doesn't setTimeout internally do that?

Given that `setTimeout` is a part of JavaScript's ancient reptilian brain, I wouldn't be surprised it doesn't do those checks just because there's some silly compatibility requirement still lingering and no one in the committees is brave enough to make a breaking change.

(And then, what should setTimeout do if delay is NaN? Do nothing? Call immediately? Throw an exception? Personally I'd prefer it to throw, but I don't think there's any single undeniably correct answer.)

Given the trend to move away from callbacks, I wonder why there is no `async function sleep(delay)` in the language, which would be free to sort this out nicely without having to be compatible with stuff from the '90s. Or something like that.
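A userland sketch of such a helper (my illustration, not a proposed standard API), which throws up front instead of silently clamping:

```javascript
// Promise-based sleep that validates the delay before scheduling.
function sleep(delay) {
  if (!Number.isFinite(delay) || delay < 0 || delay > 2 ** 31 - 1) {
    return Promise.reject(new RangeError(`invalid delay: ${delay}`));
  }
  return new Promise((resolve) => setTimeout(resolve, delay));
}

// usage: await sleep(60_000);
```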


I think it's more likely that it's just "undefined behaviour" and up to the implementers of the JavaScript engines. Given that modern browsers do limit and throttle how much you can do with setTimeout in some situations (try to use setTimeout on a page after you've switched to a VR context! More than like 120hz and it'll just.... Not run the timeout anymore, from experience with Chrome).

The browser devs have decided it's acceptable to change the behaviour of setTimeout in some situations.

https://developer.chrome.com/blog/timer-throttling-in-chrome...


In nodejs you at least get a warning along with the problematic behavior:

    Welcome to Node.js v22.7.0.
    Type ".help" for more information.
    > setTimeout(() => console.log('reached'), 3.456e9)
    Timeout { <contents elided> }
    > (node:64799) TimeoutOverflowWarning: 3456000000 does not fit into a 32-bit signed integer.
    Timeout duration was set to 1.
    (Use `node --trace-warnings ...` to show where the warning was created)
    reached
I'm surprised to see that setTimeout returns an object - I assume at one point it was an integer identifying the timer, the same way it is on the web. (I think I remember it being so at one point.)

It has returned an object for a long time now; I'd say it was always like this, actually. Don't know about very old versions.

Its return type differs between Node and the browser. If you want to type a variable that holds the return value in TypeScript and share that code across Node (e.g. Jest tests where you might include @types/node) and the browser, you need ReturnType<typeof setTimeout>, otherwise the code won't typecheck in all cases. Similar with setInterval.

I always try to force the timeout to 0 on those really annoying download sites that try to make me wait.

Sometimes the wait is over before I find the responsible code, and sometimes it does check server-side, but that's just part of the fun...


One could imagine an app that doubles the wait between each failed authentication attempt could exploit this by doggedly trying until the rate limiter breaks. Maybe not the most practical attack, but it is a way this behavior could bite you.

Don’t ever use attacker controlled data directly in your source code without validation. Don’t blame setTimeout for this, it’s impolite!

The problem is the validation. You'd expect you just have to validate a lower bound, but you also have to validate an upper bound.

It's user input, you have to validate all the bounds, and filter out whatever else might cause problems. Not doing so is a problem with the programmer, not setTimeout.

This type of thing is actually practical. Google Cloud Tasks have a max schedule date of 30 days in the future so the typical workaround is to chain tasks. As other commenters have suggested you can also set a cron check. This has more persistent implications on your database, but chaining tasks can fail in other ways, or explode if there are retries and a failed request does trigger a reschedule (I hate to say I’m speaking from experience)

True. Though if you have a need to trigger something after that much time, you might recognize the need to track that scheduled event more carefully and want a scheduler. Then you’ve just got a loop checking the clock and your scheduled tasks.

Right on. Pretty quickly that's the better solution

In response to this, I read the spec of setTimeout, but I couldn't find the part where implementations may have an upper bound. Can someone more familiar with the specs point me in the right direction?

Adding a comment here to check back later because I'm curious now if someone has the answer. I thought it would be easy to find the answer, but I can't find it either. I figured it would say somewhere a number is converted to an int32, but instead I got to the part where there's a map of active timers[1] with the time stored as a double[2] without seeing a clear loss happening anywhere before that.

[1] https://html.spec.whatwg.org/multipage/timers-and-user-promp...

[2] https://w3c.github.io/hr-time/#dom-domhighrestimestamp


If we're pedantic, this doesn't actually do what's advertised: it waits X timeouts' worth of event cycles rather than just one true big timeout, assuming the precision matters when you're stalling a function for 40 days.

I haven’t looked at the code but it’s fairly likely the author considered this? eg the new timeout is set based on the delta of Date.now() instead of just subtracting the time from the previous timeout.

No, it pretty much just does exactly that.

    const subtractNextDelay = () => {
      if (typeof remainingDelay === "number") {
        remainingDelay -= MAX_REAL_DELAY;
      } else {
        remainingDelay -= BigInt(MAX_REAL_DELAY);
      }
    };
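A deadline-based variant like the parent describes (my sketch, not the library's actual code) would compute an absolute target once and re-aim at it on every hop, so per-hop scheduling lag doesn't accumulate:

```javascript
const MAX_REAL_DELAY = 2 ** 31 - 1; // ~24.8 days

function setBigTimeoutByDeadline(callback, delay) {
  const target = Date.now() + delay; // absolute deadline, fixed once
  const tick = () => {
    const remaining = target - Date.now();
    if (remaining <= 0) {
      callback();
    } else {
      // Reschedule against the deadline; earlier drift is absorbed here.
      setTimeout(tick, Math.min(remaining, MAX_REAL_DELAY));
    }
  };
  setTimeout(tick, Math.min(Math.max(delay, 0), MAX_REAL_DELAY));
}
```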

Oh yikes. Yeah; not ideal.

To be fair, this is what I expect of any delay function. If it needs to be precise to the millisecond, especially when scheduled hours or days ahead, I'd default to doing a sleep until shortly before (ballpark: 98% of the full time span) and then a smaller sleep for the remaining time, or even a busy wait for the last bit if it needs to be sub-millisecond accurate

I've had too many sleep functions not work as they should to still rely on this, especially on mobile devices and webpages where background power consumption is a concern. It doesn't excuse new bad implementations but it's also not exactly surprising


I guess the dream of programming the next heliopause probe in JavaScript is still a ways off hahaha! :)

That wouldn't work very well because Date.now() isn't monotonic.

There is a monotonic time source available in JavaScript, though: https://developer.mozilla.org/en-US/docs/Web/API/Performance...

As I understand it, the precision of such timers has been limited a bit in browsers to mitigate some Spectre attacks (and maybe others), but I imagine it would still be fine for this purpose.


Each subtracted timeout is a 25 day timer, so any accumulated error would be minuscule. In your example there would be a total of 2 setTimeouts called, one 25 day timer and one 15 day. I think this approach has less room for error and is much simpler than calculating the date delta and trying to take into account daylight savings, leap days, etc. (but I don't know what setTimeout does with those either).

Or maybe I'm missing your point.


You don’t need to take into account daylight savings or leap days when dealing with unixtime.

Sounds a lot like the famous windows 95 bug when it would crash after 49.7 days of uptime [1]

[1] https://news.ycombinator.com/item?id=28340101


This makes me love having Go handy. I find working with signals and time based events so much nicer than other languages I use.

This is fun, though. JS is a bucket of weird little details like this.


Go timers do have weird little details, in fact one little detail changed recently in 1.23 and broke my code. A third party dependency selects on sending to a channel and time.After(0); before 1.23, due to timer scheduling delay, the first case would always win if the channel has capacity to receive, but since 1.23 the timer scheduling delay is gone and the “timeout” wins half the time. The change is documented at https://go.dev/wiki/Go123Timer but unless you read release notes very carefully (in fact I don’t think the race issue is mentioned in 1.23 release notes proper, only on the separate deep dive which is not linked from release notes) and are intimately familiar with everything that goes into your codebase, you can be unexpectedly bitten by change like this like me.

> In most JavaScript runtimes, this duration is represented as a 32-bit signed integer

I thought all numbers in JavaScript were basically some variation of double precision floating points, if so, why is setTimeout limited to a smaller 32bit signed integer?

If this is true, then if I pass something like "0.5", does it round the number when casting it to an integer? Or does it execute the callback after half a millisecond like you would expect it would?


When implementing a tiny timing library in JS a few years back I found that most engines indeed seem to cast the value to an integer (effectively flooring it), so in order to get consistent behaviour in all environments I resorted to always calling Math.ceil on the timeout value first [1], thus making it so that the callbacks always fire after at least the given timeout has passed (same as with regular setTimeout, which also cannot guarantee that the engine can run the callback at exactly the given timeout due to scheduling). Also used a very similar timeout chaining technique as described here, it works well!

[1]: https://github.com/DvdGiessen/virtual-clock/blob/master/src/...
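The rounding idea reduces to something like this (my sketch, not the linked library's actual code): round the delay up so the callback never fires before the requested, possibly fractional, timeout has elapsed.

```javascript
// Math.ceil guards against engines that floor fractional delays,
// which would let a 0.5ms timeout fire "early" at 0ms.
function setTimeoutAtLeast(callback, delay) {
  return setTimeout(callback, Math.ceil(delay));
}
```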


You're correct about JS numbers. It works like this presumably because the implementation is written in C++ or the like and uses an int32 for this, because "25 days ought to be enough for everyone".

I thought most non-abandoned C/C++ projects have long switched to time_t or similar. 2038 is not that far in the future.

Yes but JS always has backwards compatibility in mind, even if it wasn’t in the spec. Wouldn’t be surprised if more modern implementations still add an arbitrary restriction.

There's a shocking amount of systems that still have 32 bit time_t.

Linux and glibc only started supporting 64-bit time_t on 32-bit systems in the current decade.


I mean, we still have 14 years to go. It's not like it's 1999 and everyone is freaking out about y2k. We still have plenty of time.

That doesn't mean it's fine to wait and leave it until the last minute, but we have quite a few last minutes left at this point.


> we have quite a few last minutes left at this point.

    C’est l’histoire d’un homme qui tombe d’un immeuble de 50 étages.

    Le mec, au fur et à mesure de sa chute, il se répète sans cesse pour se rassurer:
    
    "Jusqu’ici tout va bien."
    "Jusqu’ici tout va bien."
    "Jusqu’ici tout va bien..."

    Mais l’important c’est pas la chute, c’est l’atterrissage.
Translated:

    There's this story of a man falling off a 50 floor building. Along his fall the guy repeats to himself in comfort:

    "So far, so good"
    "So far, so good"
    "So far, so good..."

    What matters though is not the fall, but the landing.
- Hubert, in La Haine (1995), Mathieu Kassovitz

https://youtube.com/watch?v=U-v6QVlpReU


2038 is even "now" if you're calculating futures.

Debian conversion should be done mid-2025.

JS numbers technically have 53 bits for integers (mantissa) but all binary operators turn them into 32-bit signed integers. Maybe this is related somehow to the setTimeout limitation. JavaScript also has the >>> unsigned bit shift operator so you can squeeze that last bit out of it if you only care about positive values: ((2**32-1)>>>0).toString(2).length === 32
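A quick demonstration of that coercion:

```javascript
// Bitwise operators coerce JS numbers to 32-bit integers:
const big = 2 ** 31;                  // 2147483648, fine as a double
const signed = big | 0;               // wraps into int32 range
const unsigned = (2 ** 32 - 1) >>> 0; // >>> yields an unsigned 32-bit value
```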

I assume by binary you mean logical? a + b certainly does not treat either side as 32-bit.

Sorry, I meant bitwise operators, such as: ~ >> << >>> | &

You have captured my heart and imagination

setTimeout is stranger than you think.

We recently had a failed unit test because setTimeout(fn, 1000) triggered at 999ms. That test had run more than a hundred times before just fine. Till one day it didn't.


setTimeout has no guarantees, and even if it did, your unit tests shouldn't depend on it.

Flaky unit tests are a scourge. The top causes of flaky unit tests in my experience:

    - wall clock time ( and timezones )
    - user time ( and timeouts )
    - network calls
    - local I/O
These are also, generally speaking, a cause of unnecessarily slow unit tests. If your unit test is waiting 1000ms, then it's taking 1000ms longer than it needs to.

If you want to test that your component waits, then mock setTimeout and verify it's called with 1000 as a parameter.

If you want to test how your component waiting interacts with other components, then schedule, without timers, the interactions of effects as a separate test.

Fast reliable unit tests are difficult, but a fast reliable unit test suite is like having a super-power. It's like driving along a windy mountainside road and the difference between one with a small gravel trap and one lined with armco barriers. Even though in both cases the safe driving speed may be the same, having the barriers there will give you the confidence to actually go at that speed.

Doing everything you can to improve the reliability and speed of your unit test suite will pay off in developer satisfaction. Every time a test suite fails because of a test failing that had nothing to do with the changes under test, a bit more of a resume gets drafted.
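The mock-setTimeout suggestion above can be done with nothing but an injected scheduler (names here are made up for illustration):

```javascript
// The component takes a schedule function so tests never really wait.
function makeRetrier(schedule = setTimeout) {
  return {
    retryLater(fn) {
      schedule(fn, 1000); // the behavior under test: waits 1s
    },
  };
}

// In a test, substitute a recording fake and assert synchronously:
const calls = [];
const fakeSchedule = (fn, ms) => calls.push({ fn, ms });
makeRetrier(fakeSchedule).retryLater(() => {});
```

Test frameworks like Jest ship fake timers that do this swap for you, but the principle is the same.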


>Fast reliable unit tests are difficult

Not difficult if you build your code (not just the test suite) around scheduling APIs (and queue implementations, etc.) that can be implemented using virtual time instead of CPU/wall clock time (I call that soft vs hard time).

Actually I find it a breeze to create such fast and deterministic unit tests.


I wonder if your 999ms was measured using wall-clock time or a monotonic time source? I imagine a wee time correction at an inopportune time could make this happen.

Interesting.

Maybe the system clock did a network time synchronisation during the setTimeout window.


Why does your unit test need to wait one second? Or are you controlling the system time, but it still had that error?

I don't think there is any guarantee that setTimeout will run at exactly 1000ms. Though I didn't expect it to run earlier; it definitely could run later.

instead of chaining together shorter timeouts, why not calculate the datetime of the delay and then invoke via window.requestAnimationFrame (by checking the current date ofc).

Are you suggesting checking the date every frame vs scheduling a long task every once in a long while? Can't tell if it is ironic or not, I'm sorry (damn Poe's law). But assuming not, it would be a lot more computationally expensive to do that; timeouts are very optimized and they "give back" computer resources in the meantime.

No irony intended I can be this dumb. Your point did occur to me as I posted, was just grasping at straws for a "clean" solution

Unlike setTimeout, requestAnimationFrame callbacks are automatically skipped if the browser viewport is minimized or no longer visible. You wouldn’t want to miss the frame that matters!

also, not to mention that setBigTimeout would still work in server-side JS, while requestAnimationFrame doesn't!

This is excellent. But I was hoping for a setTimeout that survived JavaScript environment restarts. Maybe setBigReliableTimeout is in your future? Hahaha! :)

I wish that I could actually see the code. I understand that it's chaining timeouts, but the git site is just garbage


This reminds me of when I am trying to find something in a cabinet, but don’t really look very hard, and my wife will say “did you even try?” and find it in ~1second.

https://git.sr.ht/~evanhahn/setBigTimeout/tree/main/item/mod...


yes, Sourcehut's interface is just godawful

I agree it’s not the prettiest, but I had no trouble clicking on “tree” to get to the folder and then “mod.ts” to see the code.

One still has to know that "tree" stands for "source code".

That seems like... a normal thing to know?

Pre-GitHub, one of the most popular web git viewers (cgit) used "tree" in this way. Never found that to be confusing.

(In git, the listing of the files and directories at a particular commit is called a "tree". So it's correct. Just not as intuitive as you, personally, would like.)


Well it doesn't stand for "source code". It's the tree of directories and files.

This is not a sourcehut problem, it is a github problem. "Tree" is semantically correct.

What? Of course it's a Sourcehut problem. They chose to use that word and could choose to use a better one.

I clicked on "browse" under the refs: main section and found the code right away

What is the use-case for such a function?

Make a joke and have something to write a blogpost about, while letting your readers learn something new.

Off the top of my head, a cron scheduler for a server that reads from a database and sets a timeout upon boot. Every time the server is rebooted the timeouts are reinitialized (fail safe in case of downtime). If upon boot there's a timeout > 25 days it'll get executed immediately, which is not the behavior you want.

This should be an interval with a lookup.

Every five seconds check for due dates sooner than 10 seconds from now and schedule them.

The longer a delay the higher the odds the process exits without finishing the work.
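A sketch of that interval-with-lookup pattern (all names hypothetical; the clock is injected so the selection logic is testable):

```javascript
const WINDOW_MS = 10_000; // look 10 seconds ahead
const POLL_MS = 5_000;    // run the check every 5 seconds

// One poll: pick jobs due within the window, schedule each with a
// short, clamped delay. Returns the selected jobs.
function pollOnce(jobs, scheduleJob, now = Date.now) {
  const due = jobs.filter((job) => job.dueAt <= now() + WINDOW_MS);
  for (const job of due) {
    // Clamp so past-due jobs (e.g. missed during downtime) run immediately.
    scheduleJob(job, Math.max(0, job.dueAt - now()));
  }
  return due;
}

// Wiring it up would look something like:
// setInterval(() => pollOnce(loadDueJobs(),
//   (job, d) => setTimeout(() => run(job), d)), POLL_MS);
```

No delay ever exceeds a few seconds, so the 32-bit clamp never comes into play.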


Why would you do that in JS rather than just using cron for it?

Not having your timeout fire unexpectedly instantly is a good use-case IMO.

because no one asked. If you need shorter intervals than the minimum you can make a function that calls the other function multiple times in a row.

So the JS engine is converting the JavaScript number (a double?) to an int and it's rolling over?

Got hit with this one a few months ago.

Just out of curiosity, what was the use case for a really long timeout? Feels like most if not all long timeouts would be best served with some sort of "job" you could persist, rather than leaving it in the event queue.


To be fair, this will be fixed by browsers when it's within spitting distance of the scale of numbers setTimeout is normally used with. (not huge numbers) Like, if it's close enough that setTimeout(() => {}, 5000) will stop working a month later, that would be a major failure on the browser vendor's part. Much too close for comfort.

But I totally understand it not being a priority if the situation is: setTimeout(() => {}, 500000000) not working in X years.


this is the thing with JS and TS - the types and stuff, it's all good until you realise that all integers are basically int 52 (represented as float 64, with 52 bits for the fraction).

Yes, it's nice and flexible - but also introduces some dangerous subtle bugs.


2^53-1 I thought.

And no, they're not all that. There's a bunch that are 2^32 such as this timeout, apparently, plus all the bit shift operations.


Not ALL integers are 52-bit; BigInts were added in ECMAScript 2020.

This is great for the folks running serverless compute! You get to start a process and let it hang until your credit card is maxed out. /s

That was before DBOS -- the serverless platform that bills you only for CPU time, not wall clock time ;) see https://www.dbos.dev/blog/aws-lambda-hidden-wait-costs

I don't see how this pricing (or product in general) is any better than cloudflare workers.

To be clear, I am not trying to be mean, I'm just curious to hear why I would pick this over cf.


So... do they not charge for sitting idle and consuming memory?


