i wrote something similar a while ago, and it gave rise to a pretty head-scratching bug... my code was doing something like
if type(x) is Foo:
and mysteriously failed for some data around the reload. turns out that Foo objects from before the reload were using a different class object than Foo objects after the reload (which makes sense in hindsight, as Foo was redefined). fun times debugging that!
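For anyone who wants to see the failure mode without setting up an actual reload: redefining the class in-process reproduces it, since a reload just re-executes the class statement and binds a brand-new class object.

```python
# Sketch of the pitfall: re-executing a class statement (which is what a
# reload does) creates a brand-new class object, so instances created
# beforehand no longer match the rebound name.

class Foo:
    pass

old = Foo()      # "pre-reload" instance

class Foo:       # simulates importlib.reload redefining Foo
    pass

new = Foo()

print(type(old) is Foo)      # False: old still references the original class
print(type(new) is Foo)      # True
print(isinstance(old, Foo))  # False: the two Foo classes are unrelated
```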
That's a pain to debug. A similar thing happened to me with reload from Python's importlib. In the case of the reloading loop I think this will only happen when redefining the type within the loop, either through a class definition or an importlib reload.
yeah, importlib.reload caused mine as well! i'm wondering if there's a way to avoid that... maybe a @reloadable class decorator that saves a reference to each object you instantiate and switches out their .__class__ upon reload? obv not great for performance, but could be useful during development (which is where i'd mostly use hot reloading anyway)
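A rough sketch of that idea — everything here (the `reloadable` name, the registry keyed on module + qualname, the weakref bookkeeping) is hypothetical, not part of any library:

```python
import weakref

def reloadable(cls, _registry={}):
    # Hypothetical decorator. The mutable default is a deliberate
    # module-level cache: it tracks live instances per (module, qualname)
    # so that when the "same" class is decorated again (e.g. after a
    # reload), the new class object can be swapped into every instance.
    key = (cls.__module__, cls.__qualname__)
    instances = _registry.get(key)
    if instances is None:
        instances = weakref.WeakSet()
        _registry[key] = instances
    for obj in list(instances):
        obj.__class__ = cls          # migrate pre-reload instances

    orig_init = cls.__init__
    def __init__(self, *args, **kwargs):
        orig_init(self, *args, **kwargs)
        instances.add(self)          # remember for the next reload
    cls.__init__ = __init__
    return cls


@reloadable
class Foo:
    pass

x = Foo()

@reloadable      # simulates the post-reload redefinition of Foo
class Foo:
    def greet(self):
        return "hi"

print(type(x) is Foo)   # True: x was migrated to the new class
print(x.greet())        # "hi", even though x predates the definition
```

The weakrefs avoid keeping otherwise-dead objects alive, and `__class__` assignment only works when the old and new classes have compatible layouts (e.g. no mismatched `__slots__`) — so this is strictly a development-time convenience, as you say.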
I did a talk on this recently at EuroPython. Hot reloaders are really, really hard to implement in Python in the general case (i.e. outside an IPython shell). You run into corner cases like this, and it basically comes down to reloading the whole app anyway.
I didn't go into as much detail as I would have liked about hot reloaders, but the tl;dr is that if you have a _single_ reference to something, then hot reloading it is somewhat simple. The problem comes when you want to hot reload a single module inside a large program.
Hot reloading a loop like this is super cool, but it only works if you're just changing constants and not defining anything. The moment you want to change the definition of an object used in the loop and hot reload it, you're in a lot of pain.
had to check to make sure isinstance wasn't doing some magic, but yes. the whole issue is that as far as python is concerned, the pre- and post-reload Foo classes aren't related in any way except by name and module, which aren't checked by either of these methods
(aside: i know that `isinstance` is the official way, but i just don't like the way it reads – "x is an instance of y" is an infix construct and looks weird in a prefix position. i don't really use inheritance so `type(x) is y` is good enough, and it reads very naturally)
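If you do want the pre- and post-reload classes to compare equal, one workaround is to compare by exactly those two attributes instead of by identity — `same_class` below is a made-up helper, and the check is unsound in general (two distinct classes can share a module and qualname), but it does pair up the two versions of a redefined class:

```python
def same_class(obj, cls):
    # Reload-tolerant comparison: match on (module, qualname) rather
    # than class-object identity.
    t = type(obj)
    return (t.__module__, t.__qualname__) == (cls.__module__, cls.__qualname__)

class Foo:
    pass

old = Foo()

class Foo:       # simulates a reload redefining Foo
    pass

print(type(old) is Foo)     # False: the identity check fails...
print(same_class(old, Foo)) # True: ...but the name/module check matches
```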
When I first read this I couldn't think of a good use case. Then the example shown seemed like something I deal with all the time and wish I could do. But then I realized I would need to remember to wrap the range in reloading before committing to the long loop I wish to change. Pretty damn cool though.
If it doesn’t add much overhead this would be amusing for something like game dev, swapping out how the game works while running it.
"If you look at these languages in order, Java, Perl, Python, you notice an interesting pattern. At least, you notice this pattern if you are a Lisp hacker. Each one is progressively more like Lisp."
This is the basic thought. Turns out PG wrote it more explicitly than ESR.
This is great work, I could not believe the fact that it keeps state (like the current epoch number, and of course all the current model state). Makes Python even more dynamic.
Nice that you could condense it to less than 100 lines of Python!
This is a great debugging tool for deep learning. Often your code might have been running for 20 hrs on 16 GPUs and you just don't want to restart from scratch again.
You don't need this tool for that: you can already save and load models from/to disk with the APIs that frameworks/libraries provide (e.g. [1]). Handy for when you need to hard reset, move, and/or share data.
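For illustration, here is the save/resume pattern being described, with pickle standing in for a framework-specific checkpoint API such as torch.save/torch.load (the file name and state layout are made up):

```python
import os
import pickle
import tempfile

# Toy stand-in for model/optimizer state; real frameworks expose their
# own serializers (e.g. torch.save(model.state_dict(), path)).
state = {"epoch": 12, "weights": [0.1, -0.3, 0.7]}

path = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
with open(path, "wb") as f:
    pickle.dump(state, f)        # persist before a hard reset or move

with open(path, "rb") as f:
    restored = pickle.load(f)    # later (or on another machine): resume

print("resuming from epoch", restored["epoch"])  # resuming from epoch 12
```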
Not sure I totally agree that this achieves those goals, but I can't argue they would be ideal goals.
> interactive
AFAIK, as you type Python code, it will execute with the changes incorporated in the next loop iteration. That's a heavily constrained kind of "interactive" (changes only propagate forward into an active execution). Wouldn't a Python shell with shell history be more interactive and hit all your goals? Sure, the advantage of this is that you can change code based on output feedback, but if you're too slow you have to backtrack, and if you're too fast you just wait, in either case. And if you want to apply code to earlier loop iterations, you're out of luck: those are set in stone, run against a different version of the code.
> reproducible
How can you get this? I.e., how do you reproduce any loop changes outside of the IDE/editor's undo/redo stack? The `reloading` function hides which version of the code produced which output.
> recordable, replayable
This moves the "history" to the IDE/editor's undo/redo stack. I think this is worse, because what happens when you close the IDE/editor without recording which console output came from which version of the code?
My main gripe is this sentence in the README:
> This lets you e.g. add logging, print statistics or save the model without restarting the training and, therefore, without losing the training progress.
If the italicised part ("without losing the training progress") is the prime concern, and the purpose of this is to solve that, I'm just saying PyTorch already lets you save/persist training state; maybe the author didn't know about that + Python shell combo?
Really nice tool! I often develop in a similar pattern (though not quite as functional) using IPython and its %autoreload magic. This looks a lot cleaner.
It does work in Python 2 with the following patch:
diff --git a/reloading/reloading.py b/reloading/reloading.py
index 1d28e2f..1a5f981 100644
--- a/reloading/reloading.py
+++ b/reloading/reloading.py
@@ -84,9 +84,10 @@ def reloading(seq):
exec(body)
except Exception:
exc = traceback.format_exc()
- exc = exc.replace('File "<string>"', f'File "{fpath}"')
+ exc = exc.replace('File "<string>"', 'File "{}"'.format(fpath))
sys.stderr.write(exc + '\n')
- input('Edit the file and press return to continue with the next iteration')
+ print('Edit the file and press return to continue with the next iteration')
+ sys.stdin.readline()
# copy locals back into the caller's locals
for k, v in locals().items():
FYI there's also reload(module), which was a built-in function in Python 2 (in Python 3 it lives at importlib.reload).
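A self-contained sketch of the Python 3 spelling (the throwaway module name `mymod` is made up). Note that reload re-executes the module and updates the module object in place, but names bound elsewhere beforehand — including classes, as discussed above — keep pointing at the old objects.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module, import it, edit it on disk, then reload.
d = tempfile.mkdtemp()
sys.path.insert(0, d)
with open(os.path.join(d, "mymod.py"), "w") as f:
    f.write("VALUE = 1\n")

import mymod
print(mymod.VALUE)               # 1

with open(os.path.join(d, "mymod.py"), "w") as f:
    f.write("VALUE = 2 + 2\n")   # simulate editing the source

importlib.reload(mymod)          # re-executes the module's source
print(mymod.VALUE)               # 4
```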
One great use-case I've come across is in developing a network proxy for game packet interception, see:
Django's development mode aka "runserver" behaves a lot like this (as does a ton of different Node/JavaScript systems, many of which are much more precise on what sub-components to actually re-load). It's a big win for development pacing.
Being able to loop a specific segment of code looks like a great tool for my toolbox when writing data processing code, particularly for spatial data. I will have an input dataset and be iterating on the algorithm(s) as I watch the output (often rendered in QGIS) update dynamically but requiring me to re-run the test script.
Agreed, but to clarify: Django's runserver is a simple restart-the-process-on-file-change mechanism. Web servers don't have heavy startup costs, so that's fine.
Being able to modify code during a run, where it has a heavy startup cost, is a different beast. I also wonder if Django's runserver could incorporate something like `reloading`, for even speedier reloads?