
I must be dumb, because every time I dive into async/await, I feel like I reach an epiphany about how it works and how to use it. Then a week later I read about it again and have totally lost all understanding.

What do I gain if I have code like this [0], which has a bunch of `.await?` in sequence?

I know .await != join_thread(), but doesn't execution of the current scope of code halt while it waits for the future we are `.await`-ing to complete?

I know this allows the executor to go poll other futures. But if we haven't explicitly spawned more futures concurrently, via something like task::spawn() or thread::spawn(), then there's nothing else the CPU can possibly do in our process?

[0] https://github.com/async-rs/async-std/blob/master/examples/t...



A good example: say you want to handle 100k TCP sessions concurrently. You probably don't want to launch 100k threads, considering the overhead of creating them and constantly switching between them. You also don't want to do things synchronously, as you'll constantly be waiting on pauses instead of doing work on the 100k sessions. So you launch 100k instances of an async function and they all stay in a single thread (or a couple of threads if you want to utilize multiple cores for the work), and instead of constantly waiting on pauses the executor simply works through the backlog of queued-up events.

Same code flow, it just allows you to launch the same thing multiple times without having to wait for the whole thing to finish sequentially, or waiting on the OS to schedule your threads.
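
To make the "100k sessions, few threads" idea concrete, here is a minimal sketch in Python's asyncio terms (the session handler and counts are illustrative, and the sleep stands in for real socket IO):

```python
import asyncio

async def handle_session(session_id: int) -> int:
    # Each "session" awaits IO (simulated with a sleep). While it waits,
    # the single thread runs other sessions instead of blocking.
    await asyncio.sleep(0.01)
    return session_id

async def main() -> None:
    # 10,000 concurrent sessions on one thread -- no OS threads spawned.
    results = await asyncio.gather(*(handle_session(i) for i in range(10_000)))
    print(len(results))  # prints 10000

asyncio.run(main())
```

This shape scales to far more sessions than OS threads comfortably would, because a suspended session costs only a small heap allocation rather than a whole thread stack.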


This is missing a crucial explanation: the underlying OS APIs are asynchronous.


Yeah, for sure. In Java/C# I see people do this all the damn time: use async methods for REST endpoints, then make a blocking DB call. Or even worse, make a non-async REST call to another service from inside an async handler.

As soon as you do that, your code isn't async anymore. And if you're using a framework like Vert.x or Node that only runs one thread per core, you're in big trouble.

The most reasonable answer I've seen to all this is Java's Project Loom. An attempt to make fibers transparently act like threads, so you can use regular threaded libraries as async code.

Rust is going to have the same problem Java does with async. A lot of code was written way before async was available, and it's not always obvious whether something blocks.


It's possible to write crappy code, async or otherwise.

In my c# world, I use async methods for REST endpoints, which in turn use async calls for anything IO-bound (database, message bus, distributed key store, file system etc). I think more often than not, it's done correctly.


A message broker works here when you want async behaviour but you are integrating with sync code. To use your REST example, you receive the call, send a message to DoSomething and then immediately return http 202, perhaps with some id the ui can poll on (if required). Meanwhile, the DoSomething message queue is serviced by a few threads.


That works, but it's an uncommon pattern. Most people prefer to wait, in my opinion. A single DB worker doing batch updates would probably be enough.


Does this mean that rust async is using poll/epoll/kqueue under the hood?


Yes, the executor will use whatever io multiplexing the platform provides (and it’s been coded to support).

If the executor is Tokio, it’s built on mio which will use one of kqueue, epoll or iocp depending on the platform: https://docs.rs/mio/0.6.19/mio/struct.Poll.html#implementati...


Strictly speaking, it's not tied to any particular method. It depends on your executor. That said, the most popular executor does use epoll/kqueue/iocp. (tokio)


Doesn't the executor need to be aware of all the different mechanisms that can be used to poll, and so there's an implicit coupling between the async function implementation and the executor?

For example, socket.read() might return a future that represents a read on a file descriptor. I don't know the internals of Rust's async support at all, but presumably the future is queued up and exposes some kind of trait that Tokio et al can recognize as being an FD so it can be polled on using the best API such as epoll_wait() or whatever.

But let's say there's some kernel or hardware API or something that has a wait API that isn't based on file descriptors, and I implement my own async function get_next_event() that uses this API. Do I need to extend Tokio, or the Rust async runtime API, to make it understand how to integrate this into its queues? In a non-FD case, wouldn't it have to spawn a parallel thread to handle waiting for that one future, since it can't be included in epoll_wait()?


I slightly mis-spoke in a sense, yeah. This stuff has changed a bunch over the last few years :)

So, futures have basically two bits to their API: the first is that they're inert until `poll` is called. The second is that they need to register a "waker" with the executor before they return `Pending`. So it's not so much that the executor needs to know details about how to do the polling; rather, the person implementing socket.read() needs to implement the future correctly. It would construct the waker to do the right thing with epoll. Tokio started before this style of API existed, and so bundles a few concepts in the current stack (though honestly, an integrated solution is nicer in some ways, so I don't think it's a bad thing, just that it makes it slightly easier to conflate the pieces since they're all provided by the same package).

Async/await, strictly speaking, is 100% agnostic of all of this, because it just produces stuff with the Futures interface; these bits are inside the implementation of leaf futures. And executors don't need to know these details, they just need to call poll at the right time, and in accordance with their wakers.

I can't wait until the async book is done, it's really hard remembering which bits worked which way at which time, to be honest.
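
The poll/waker contract described above can be modeled with a toy example. This is a sketch in Python, not real Rust machinery: `TimerFuture`, the run queue, and the `wake` function are all invented for illustration (in Rust the equivalents are `Future::poll`, a `Context` carrying a `Waker`, and the executor's task queue):

```python
from collections import deque

class Pending: pass
class Ready:
    def __init__(self, value): self.value = value

class TimerFuture:
    """A leaf future: stays Pending until an external event completes it."""
    def __init__(self):
        self.done = False
        self.waker = None
    def poll(self, waker):
        if self.done:
            return Ready("elapsed")
        self.waker = waker       # register the waker before returning Pending
        return Pending()
    def complete(self):          # called by the "reactor" side (e.g. epoll)
        self.done = True
        if self.waker:
            self.waker()         # tell the executor to poll this task again

run_queue = deque()
fut = TimerFuture()
run_queue.append(fut)

def wake():                      # the waker just re-queues the task
    run_queue.append(fut)

# Executor loop: poll whatever has been queued; never busy-wait on Pending.
steps = []
while run_queue:
    task = run_queue.popleft()
    result = task.poll(wake)
    if isinstance(result, Ready):
        steps.append(result.value)
    elif not fut.done:
        fut.complete()           # simulate the IO event arriving

print(steps)  # prints ['elapsed']
```

The key point is the division of labor: the leaf future registers the waker and returns Pending, the reactor calls the waker when the event fires, and the executor just polls whatever has been woken.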


> And executors don't need to know these details, they just need to call poll at the right time, and in accordance with their wakers.

Is this true? Essentially this is claiming that an executor does not need to use mio (epoll/kqueue/...) to be able to execute futures that do async network i/o.

So who uses mio? Would each type implementing the Future trait use mio internally as a private detail? That is, using two such future types, would they maintain multiple independent kqueues and the executor isn't able to put them both in one?


mio is useful when you’re writing an application that runs on Windows/Linux/Mac. But you can use futures on any platform, including embedded ones. Those would be coded against whatever API the system offers. There’s an embedded executor, for example.

Tokio uses mio to implement its futures that do async IO, so if you use Tokio, you use mio. You don’t have to use Tokio, though it is the most popular and most battle tested.

Many futures don’t do IO directly; for example, all of the combinator futures. Libraries can be written to be agnostic to the underlying IO, only using the AsyncRead/AsyncWrite traits, for example.


The TL;DR: is that, while the `std::future::Future` trait is generic, the actual type that implements this trait is often tied to a particular executor.


This really helped my practical understanding - Thanks!


async/await are coroutines and continuations (bear with me).

Here is synchronous code:

    result = server.getStuff()
    print(result)
Here is synchronous code that tries to be asynchronous:

    server.getStuff(lambda result: print(result))
Once server.getStuff completes, the callback passed to it is called with the result.

Here is the same code with async/await:

    result = await server.getStuff()
    print(result)
Internally, the compiler rewrites it to (roughly) the second form. That's called a continuation.

That's pretty much it.

A more involved example.

Synchronous code:

    result = server.getStuff()
    second = server.getMoreStuff(result+1)
    print(second)
Synchronous code that tries to be asynchronous:

    server.getStuff(
        lambda result: server.getMoreStuff(
          result+1, 
          lambda result2: print(result2)
    ))
A lot of JS code used to look like this hideous monstrosity.

Async/await version:

    result = await server.getStuff()
    second = await server.getMoreStuff(result+1)
    print(second)
Remember again that it is basically transformed by the compiler into the second form.
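
A runnable version of that last example, sketched with Python's asyncio (the server stubs are invented; the sleeps stand in for network round-trips):

```python
import asyncio

async def get_stuff() -> int:
    await asyncio.sleep(0)       # stands in for a network round-trip
    return 1

async def get_more_stuff(n: int) -> int:
    await asyncio.sleep(0)
    return n * 10

async def main() -> int:
    result = await get_stuff()                  # suspends, resumes with 1
    second = await get_more_stuff(result + 1)   # suspends, resumes with 20
    return second

print(asyncio.run(main()))   # prints 20
```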


Thanks. Helpful. My question is, in this example:

    result = await server.getStuff()
    second = await server.getMoreStuff(result+1)
    print(second)
`await getStuff()` MUST terminate before `await getMoreStuff()` begins. So this chunk alone is analogous to synchronous code, unless we're in the middle of a spawned task, and there are other spawned tasks in the executor that can be picked up.


Yup, your understanding is correct. That code behaves equivalently to the synchronous version, and the only benefit is that the thread can run other tasks while it's waiting for getStuff() and getMoreStuff() to come back.

Async/await is really popular in the JavaScript community because in web apps, you usually only have a single thread of execution which you share with the browser UI code. So if your code made a network request synchronously, the user might not be able to scroll or click links or anything until it finished.
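
That benefit is easy to demonstrate: two tasks that each await the same simulated IO finish in roughly the time of one, because the waits overlap (a sketch using asyncio; the delays are illustrative):

```python
import asyncio, time

async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)   # simulated network wait
    return name

async def main() -> float:
    start = time.monotonic()
    # Inside one task, two awaits in sequence run sequentially. But two
    # *tasks* overlap: gather runs both, so total time is ~max, not the sum.
    a, b = await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))
    assert (a, b) == ("a", "b")
    return time.monotonic() - start

elapsed = asyncio.run(main())
assert elapsed < 0.19            # overlapped: ~0.1s, not 0.2s
```

If the two fetches were awaited back to back inside one task instead of gathered, the elapsed time would be roughly the sum of the delays.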


Yes, the idea is that the thread that is executing this piece of code can "steal" other work when it is awaiting on either of those methods.

Frankly, in the case of sequential flow like the above, I would rather write

  result = server.getStuff()
  second = server.getMoreStuff(result+1)
  print(second)
and have the runtime automatically perform work-stealing for me. No need for awaits. They just litter the code. This is what Go does.


Thus the FFI impact on Go when crossing a language boundary, and workarounds like runtime.LockOSThread().


Go does not do that implicitly, there is an explicit "go" syntax.

Gevent in Python does something similar [implicit switching] using dirty monkeypatching. It is great while it works. Sooner or later, explicit cooperative concurrency such as that provided by async/await syntax wins (e.g., the asyncio, trio, and curio Python libraries).


In Go you need to manually tell the runtime to spawn a goroutine with the `go` keyword, which also «litters» the code…


Except in practice, the go keyword is used much more coarsely and sparingly, because you can group a whole block of function calls under one big go call. With async, every single function has to be flagged as being asynchronous and be called differently (although maybe some modern languages have a way to group all the await calls?).


It's funny how gophers can at the same time defend the «explicitness» of if-based error handling, and be annoyed by syntactic annotations for the yield points of coroutines (because that's exactly what `await` is, versus the yield points silently added by the Go compiler everywhere so the runtime can perform its scheduling).


Not sure who you're referring to... I certainly don't like many aspects of the Go language; goroutines and the "go" keyword aren't among them.


Correct, if there's nothing else that can be picked up, it'll behave essentially the same as the non-async code. But it means that adding other things to be done at a later stage is much, much easier than trying to do so without it.


This is an incredibly helpful explanation. I went from not really knowing what all this mumbo jumbo was about to a useful mental model. Cheers!


I don't understand your examples of "Synchronous code that tries to be asynchronous". In fact, the examples you provided are of asynchronous code being...asynchronous. Callbacks are asynchronous (or to be fully correct, I should say that they allow one to program asynchronously, which is exactly what async/await does).

Indeed, since you mention continuations, I'm sure you realize that they're more or less callbacks.


    server.getStuff(gotStuff)
    server.getMoreStuff(gotMoreStuff)
Just using functions is simpler and also more powerful. A function being async usually means stuff will happen, things can go wrong, and you might want to do different things depending on the response or whether it failed.

Where await is useful, though, is in serial execution of async functions that really should be sync, but are async as an optimization so as not to block the thread.

It is really unfortunate that so much extra cruft had to be introduced to JS (coroutines, async, promises) in order to be able to await. Reading complex Promise-based code is very unpleasant, with async functions pretending to be pure, without any error handling, full of side effects, and with omitted returns.

With callbacks we had inexperienced programmers writing pyramids of callbacks and if logic. But it was not that bad, as the complexity was in your face, and not hidden under layers of leaky abstraction.


So, the power of async await is no greater than the thread pool manager sitting under it.

You are correct. In edge cases where there is only 1 await in the queue for 1 process with 1 thread you gain nothing.

But you're accurately describing an edge case where await has limited value.

Await's true power shows up when you anticipate having multiple in-flight operations that all will, at overlapping points, be waiting on something.

Rather than consume the current thread while waiting, you're telling the run-time, go ahead and resume another task that has reached the end of its await.

This was possible before using various asynchronous design patterns, but all of them were clunky, in that they required boilerplate code to do what the compiler should be able to figure out on its own:

"Hey, runtime. This is an asynchronous call. Go do something useful with this thread and get back to me."

Second, await is MUCH EASIER for future developers to process because it looks exactly like any other method call and makes it easy to reason about the logic flow of the code.

Rather than chasing down async callbacks and other boilerplate concepts to manually handle asynchronous requests, the code reads like its synchronous twin.

int a = await EasyToFollowAsyncIntent();

This makes the code much easier to reason about.

To me those are the 2 biggest gains from async.

1. Less boilerplate code for asynchronous calls.

2. Code remains linearly readable despite being highly asynchronous.


In small examples like this, you don't gain anything. For the sake of the example, we just run one task. But you _could_ run 100 with them. And at each of those `awaits`, they could schedule differently.

For a more complex networked application, we have the tutorial here: https://github.com/async-rs/a-chat


Wouldn't it make more sense to show an example that actually takes advantage of async/await? I don't get why they use examples that need a disclaimer like "you could run 100 jobs for this to make sense". The example should just include that (and it should probably do something that makes sense when run a hundred times).


The example is intended for you to be able to implement it, not as a showcase.

I think the expectation with Rust async-await at the moment is likely that people are familiar with async syntax from other languages e.g. Python - it's not even in beta yet, you need to be running nightly to get the syntax.


Yeah if there are no other futures spawned, then the await is going to cause the app to just sit there until the future completes. It's got nothing better to do.

If there were another future spawned, then the await would cause the runtime to sit there until either of the futures completed. The code would attend to the first future that completes until that hits an await.


And where it all comes together is when async lets us write concurrent functions that compose together well, in a way that functions that can block on IO and lock acquisition do not.


How so? The same API is trivial to implement using threads and futures. A future, after all, is just a one-shot channel or rendezvous point.

Async is just a way to get cooperative threads compiled to a static state machine, trading lower concurrent utilization and throughput for less context switch overhead and lower latency.


Right, that specific instance is essentially a single-threaded* application.

Now imagine that you spawned a few hundred of them with JoinAll. Each would run, multiplexed within a single thread, with execution being passed at the await points.

* anyone know the correct nomenclature for this? Single-coroutine?


Cooperative threads are still threads, they just aren't preemptive.


I suppose you're right. The example is both single-threaded at the OS level and single-threaded at the program level.


Serial?


Async is a hard concept but what can be revealing is going through the three steps:

1. Get used to callbacks in NodeJS, for example write some code using fs that reads the content of a file, then provide a callback to print that content.

2. Get used to promises in NodeJS, for example turn the code in #1 into a promise by creating a function that calls the resolve/reject handler as appropriate in the callback from opening that file. Then use the promise to open the file and use .then(...) to handle the action.

3. Now do it in async. You have the promise, so you just need to await it and you can inline it.

By doing it in the 3 steps I find it is more clear what is really happening with async/await.
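
The same three-step progression can be sketched in Python rather than NodeJS (the `read_data_*` names are invented, and step 1's callback is invoked synchronously just to keep the example self-contained):

```python
import asyncio

# Step 1: callback style -- the caller passes a function to run with the result.
def read_data_cb(callback):
    callback("hello")

# Step 2: wrap the callback API in a Future (the promise analogue).
def read_data_future(loop):
    fut = loop.create_future()
    read_data_cb(lambda data: fut.set_result(data))
    return fut

# Step 3: await the future -- the code reads like the synchronous version.
async def main():
    loop = asyncio.get_running_loop()
    data = await read_data_future(loop)
    return data.upper()

print(asyncio.run(main()))   # prints HELLO
```

Step 3 is the payoff: the awaiting code reads like the synchronous version while still being driven by the same callback machinery underneath.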


Say you want to listen to a socket and also receive input from the keyboard at the same time.

If you have an async method that can wait for input on both devices, you can await the results of both of them, and they won't block each other.
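
A sketch of that pattern in Python terms, with sleeps standing in for the socket and the keyboard (the names and delays are invented):

```python
import asyncio

async def socket_read() -> str:
    await asyncio.sleep(0.05)    # pretend data arrives on the socket
    return "socket: ping"

async def keyboard_read() -> str:
    await asyncio.sleep(0.01)    # pretend a key is pressed first
    return "key: q"

async def main() -> list:
    # Neither source blocks the other; events are handled as they arrive.
    t1 = asyncio.create_task(socket_read())
    t2 = asyncio.create_task(keyboard_read())
    done = []
    for coro in asyncio.as_completed([t1, t2]):
        done.append(await coro)
    return done

print(asyncio.run(main()))   # prints ['key: q', 'socket: ping']
```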


> I must be dumb

Nope, async really isn't trivial.

> I know .await != join_thread(), but doesn't execution of the current scope of code halt while it waits for the future we are `.await`-ing to complete?

It doesn't, that's the charm of it.

It's best to treat 'await' as syntactic sugar, and to dig in to the underlying concepts.

I realise we're not talking C#/.Net, but that's what I know: in .Net, your function might do slow IO (network activity, say) then process the result to produce an int. Your function will have a return-type of `Task<int>`. Your function will quickly return a non-completed Task object, which will enter a completed state only once the network activity has concluded and processing has occurred to give the final `int` value.

The caller of your function can use the `Task#ContinueWith` method, which enqueues work to occur if/when the Task completes, using the result value from the Task. (We'll ignore exceptions here.)

Internal to your function, the network activity itself will also have taken the form of a standard-library Task, and our function will have made use of its `ContinueWith` method. Things can compose nicely in this way; `Task#ContinueWith` returns another Task.

(We needn't think about the particulars of threads too much here, but some thread clearly eventually marks that Task object as completed, so clearly some thread will be in a good position to 'notice' that it's time to act on that `ContinueWith` now. The continuation generally isn't guaranteed to run on the same thread as where we started. That's generally fine, with some notable exceptions.)

You might think that chain-invoking `ContinueWith` would get tedious, as you'd have to write a new function for each step of the way if we make use of several async operations - each continuation means writing another function to pass to `ContinueWith`, after all. Perhaps it would be more natural to just write one big function and have compiler handle the `ContinueWith` calls.

You'd be right. That's why they invented the `await` keyword, which is essentially just syntactic sugar around .Net's `ContinueWith` method. It also correctly handles exceptions, which would otherwise be error-prone, so it's generally best to avoid writing continuations manually.

There's more machinery at play here of course, but that seems like a good starting point.
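
For a rough cross-language analogue of `Task#ContinueWith`, Python's `concurrent.futures.Future` exposes `add_done_callback`, which attaches a continuation to run once the future completes:

```python
from concurrent.futures import ThreadPoolExecutor

# add_done_callback runs once the future completes -- on whichever thread
# marks it done, or immediately on the current thread if it is already
# completed when the callback is attached.
results = []

def slow_io() -> int:
    return 41                      # stands in for network activity

with ThreadPoolExecutor(max_workers=1) as pool:
    task = pool.submit(slow_io)
    task.add_done_callback(lambda t: results.append(t.result() + 1))

print(results)                     # prints [42]
```

As with `ContinueWith`, chaining these by hand gets tedious fast, which is exactly the boilerplate `await` compiles away.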

Assorted related topics:

* If you use `ContinueWith` on a Task which is already completed, it can just stay on the same thread 'here and now' to run your code

* It's possible to produce already-completed Task objects. Rarely useful, but permitted.

* There's plenty going on with thread-pools and .Net 'contexts'

* The often-overlooked possibility of deadlocking if you aren't careful [0]

* None of this would make sense if we had to keep lots of background threads around to fire our continuations, but we don't [1]

* Going async is not the same thing as parallelising, but Tasks are great for managing parallelism too

* This stuff doesn't improve 'straight-line' performance, but it can greatly improve our scalability by avoiding blocking threads to wait on IO. (That is to say, we can better handle a high rate of requests, but our speed at handling a lone request on a quiet day, will be no better.)

I found this overview to be fairly digestible [2]

[0] https://blog.stephencleary.com/2012/07/dont-block-on-async-c...

[1] https://blog.stephencleary.com/2013/11/there-is-no-thread.ht...

[2] https://stackoverflow.com/a/39796872/

See also:

https://docs.microsoft.com/en-us/dotnet/standard/parallel-pr...

https://docs.microsoft.com/en-us/dotnet/api/system.threading...


> It's best to treat 'await' as syntactic sugar, and to dig in to the underlying concepts.

Slight word of warning: `async/await` is more than just sugar in Rust, it also enables borrowing over awaits, which was previously not possible.


Interesting, thanks.


Nerding out a bit more (it was a bit late yesterday): this is why `poll` takes the future wrapped in this weird type called `Pin`, which is a guarantee that the value does not move in memory while the Future is polled. This is also one of the reasons the feature took so long; Rust previously only had ways to detect potential moves in memory, but could not disallow them.

https://doc.rust-lang.org/std/future/trait.Future.html#requi...


Interesting ideas. I really must learn Rust properly.


async/await is all about letting you write serial looking code with the smallest memory footprint short of writing hand-coded continuation passing style (CPS) code.


User-space threads and coroutines are pretty much the right abstraction for application code.

Async can be useful when more control over the details of execution is needed.


Async/await and futures/promises (and before that, stuff like Java's Executor abstraction) are being added to a lot of languages because it is very difficult for even experienced developers to manage threads in a bug-free way.

I've seen a lot of people try to manage complex programs by working with threads directly. Whatever they come up with is very unlikely to be as correct and reliable as the abstractions provided by the language. Even when they get it right, programs written with those techniques are difficult to modify without introducing new bugs.

Manual management of threads is becoming like manual management of memory: it is discouraged by newer language features, and you should only do it if you really need to.



