Hi,
We’re running L4T 24.1 for Jetson TX1, 64-bit, and have noticed a memory leak somewhere in the OS. The memory can be released by logging out and logging back in (or by rebooting). For example, repeatedly opening and closing gedit while monitoring memory over time reveals the leak. After about 200 iterations, the system locks up. We see this behavior on two of our systems.
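The script itself is not reproduced in this thread. A minimal sketch of the kind of loop described above might look like the following. The function names, the 3 second delay, and the "total minus free, buffers and cached" definition of used memory are all assumptions made for illustration, not the original poster's code.

```shell
#!/bin/sh
# Hypothetical reproduction loop; NOT the original poster's script.

# Print used memory in kB, estimated as MemTotal - MemFree - Buffers - Cached.
# Reads /proc/meminfo by default; a different file can be passed for testing.
used_kb() {
    awk '/^MemTotal:/ {t=$2}
         /^MemFree:/  {f=$2}
         /^Buffers:/  {b=$2}
         /^Cached:/   {c=$2}
         END {print t - f - b - c}' "${1:-/proc/meminfo}"
}

# Open and close a program N times, printing "iteration used_kB" each time.
# Arguments: iterations (default 200), program (default gedit),
# seconds to leave the window open (default 3, an arbitrary choice).
leak_loop() {
    ll_n="${1:-200}"
    ll_app="${2:-gedit}"
    ll_delay="${3:-3}"
    ll_i=1
    while [ "$ll_i" -le "$ll_n" ]; do
        "$ll_app" &
        ll_pid=$!
        sleep "$ll_delay"
        kill "$ll_pid" 2>/dev/null || true   # program may already have exited
        wait "$ll_pid" 2>/dev/null || true
        echo "$ll_i $(used_kb)"
        ll_i=$((ll_i + 1))
    done
}
```

Running `leak_loop 200` in a terminal on the desktop and watching the second column should show whether used memory keeps climbing from one iteration to the next.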
The script makes the leak pretty obvious. As an experiment on R24.1 64-bit, I ran xosview and htop while the script ran. In case 1, gedit was opened directly on the local display. In case 2, the display was remote via ssh -Y, with gedit popping up on my desktop host instead. Remote display does not trigger the leak. On the local display, used memory climbs so fast that the graph looks almost animated. The leak seems to be somewhere in the window manager or display software. I’m not sure what the significance is, but once the display manager crashes, memory use immediately goes back to normal.
Yes, I have also seen evidence of a memory leak. I never did anything as elaborate as creating scripts, but just keeping applications such as gedit, Chromium, Files, and Synaptic open over days would seemingly cause a memory problem. When viewed with System Monitor, I could see the amount of available memory decrease and the amount of memory used by a specific app increase for no obvious reason. Stuff like this is ongoing with 24.1, and it was similar in 23.2. Of course, the problem might be with the individual application. In 23.2, suspend to RAM would work with SATA and an SD card in place, but quite often resuming was not possible. In 24.1, suspend to RAM with SATA and an SD card in place is not possible on my install. I think all of this points to memory management issues either with the TX1 or with some apps, but I do not think it is new or unique to 24.1.
What I found interesting was that there is no memory leak for GUI apps doing remote display to another machine, while the same app doing local display to the GUI has an obvious leak. The application itself can be ruled out. Components related to graphical display itself must be the leak source (that’s still a fairly broad set of software, but gedit does not use OpenGL for example, so the list narrows somewhat).
I’ve only experienced this issue with 24.1 64-bit (I tested 23.2 and am currently using 24.1 32-bit). I’m not sure what the source of the problem is, but it crashed mostly with gedit or any browser. It would also crash even when only running terminal windows for a couple of hours.
A crude way to bring it back is to ssh into the Jetson and restart lightdm, preferably using a script that does it automatically when memory runs out. If someone can find out what is causing this behavior I would be grateful, as I don’t really even know where to search.
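For anyone wanting to automate that workaround, here is a rough sketch of such a watchdog. The 10 percent threshold, the function names, the `RESTART_CMD` variable, and the use of `service lightdm restart` as the restart command are assumptions for illustration; adjust them for your install. It works around the symptom only and does nothing about the leak itself.

```shell
#!/bin/sh
# Hypothetical low-memory watchdog; a workaround sketch, not a fix.

# Command used to restart the display manager (assumed; needs root).
RESTART_CMD="${RESTART_CMD:-service lightdm restart}"

# Print reclaimable memory (MemFree + Buffers + Cached) as a whole
# percentage of MemTotal. Reads /proc/meminfo unless a file is given.
avail_pct() {
    awk '/^MemTotal:/ {t=$2}
         /^MemFree:/  {f=$2}
         /^Buffers:/  {b=$2}
         /^Cached:/   {c=$2}
         END {printf "%d\n", (f + b + c) * 100 / t}' "${1:-/proc/meminfo}"
}

# Restart the display manager once if available memory is below the
# threshold. Arguments: threshold percent (default 10), meminfo file.
check_once() {
    co_threshold="${1:-10}"
    co_pct=$(avail_pct "${2:-/proc/meminfo}")
    if [ "$co_pct" -lt "$co_threshold" ]; then
        $RESTART_CMD
    fi
}
```

One way to use it would be to run `while sleep 60; do check_once 10; done` as root in the background. Note that restarting lightdm ends the current desktop session, so anything unsaved in open GUI applications is lost.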
The effect is easy to replicate: just leave the TX1 turned on for more than ~18 hours and the system becomes progressively more sluggish. After more than a day and a half of operation, the TX1 virtually always requires a reboot to be usable. I’ve verified this still exists with a clean install using the JetPack 2.2.1 standard installation.
Although I personally haven’t experienced the behavior described above (perhaps because of the applications I use during everyday development), after further investigation the engineering team may have found an issue related to an extension of the display window manager. A fix is in progress and is slated for the next L4T update. Thank you for the reports from the community!
In case you need a faster way to trigger this issue, download a recent version of Qt and build it:
wget -P Downloads http://download.qt.io/official_releases/qt/5.6/5.6.1/single/qt-everywhere-opensource-src-5.6.1.tar.gz
tar -xzf Downloads/qt-everywhere-opensource-src-5.6.1.tar.gz -C Downloads
cd Downloads/qt-everywhere-opensource-src-5.6.1
sudo ./configure -prefix /usr/share/qt5 -confirm-license -qt-xcb -opengl es2 -nomake tests -nomake examples
sudo make
The build takes about 6 hours. Don’t use multiple cores: multi-core builds of Qt on the TX1 don’t work properly, and even though the build appears to succeed, what you get out of it isn’t usable. By the end of the build, there is so little memory left for the system that you’ll basically have to hard reset the device to get things going again.