<div dir="ltr"><div><div><div><div>Hi Alexander,<br><br></div>With regards to:<br><br>In principle, I'm for pruning cache more aggressively on db_session
exit, but unfortunately some people like to continue working with
objects after exiting from db_session (for example, generate HTML
content using some template engine, etc.), although in my opinion it is
more correct to perform such actions inside db_session.<br><br></div>I think it's important to be strict about dropping the cache on session exit; it's a pain to have scripts that do a decent amount of work on lots of objects blow up to many gigabytes of RAM usage. I then have to produce workarounds, like restarting processes often, which is a big pain.<br><br></div>Thanks for your work.<br><br></div>Matt<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 15 April 2016 at 19:28, Alexander Kozlovsky <span dir="ltr"><<a href="mailto:alexander.kozlovsky@gmail.com" target="_blank">alexander.kozlovsky@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thanks for the suggestion, I'll think about how to implement cache pruning.<br><br>Regarding timing of queries, Pony already collects such information, regardless of the `sql_debug` state. A Database object has a thread-local property `db.local_stats` which contains statistics about the current thread, and which can also be used in a single-threaded application. The property value is a dict, where keys are SQL queries and values are special QueryStat objects. 
Each QueryStat object has the following attributes:<br><br><div>- `sql` - the text of the SQL query</div><div>- `db_count` - the number of times this query was sent to the database</div><div>- `cache_count` - the number of times the query result was taken directly from the db_session cache (for cases when a query was called repeatedly inside the same db_session)</div><div>- `min_time`, `max_time`, `avg_time` - the time required for the database to execute the query</div><div>- `sum_time` - total time spent (should be equal to `avg_time` * `db_count`)</div><br><div>So you can do something like this:</div><div><br></div><div>&nbsp;&nbsp;&nbsp;&nbsp;from operator import attrgetter</div><div>&nbsp;&nbsp;&nbsp;&nbsp;query_stats = sorted(db.local_stats.values(), reverse=True, key=attrgetter('sum_time'))</div><div>&nbsp;&nbsp;&nbsp;&nbsp;for qs in query_stats:</div><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print(qs.sum_time, qs.db_count, qs.sql)</div><br>If you call the method `db.merge_local_stats()` then the content of `db.local_stats` will be merged into `db.global_stats`, and `db.local_stats` will be cleared. If you are writing a web application you can call `db.merge_local_stats()` when you finish processing an HTTP request in order to clear `db.local_stats` before processing the next request. The `db.global_stats` property can be used in a multi-threaded application in order to get total statistics over all threads.<br><br>Hope that helps<br><br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Apr 15, 2016 at 8:49 PM, Matthew Bell <span dir="ltr"><<a href="mailto:matthewrobertbell@gmail.com" target="_blank">matthewrobertbell@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div>Hi Alex,<br><br></div>I don't believe any objects were leaking out of the session; the only thing I store between sessions is integers (object IDs). 
I have solved this problem myself by doing the work in python-rq jobs, rather than in one big script; however, it would be great to have some sort of "force clear the cache" functionality - ideally, as you say, having it strictly happen upon leaving session scope.<br><br></div>Also useful for some niche situations would be having the option to disable caching for a given session.<br><br></div>Another, unrelated suggestion: an option (or a default) to show the timing of queries when using sql_debug(True). This would make performance profiling much simpler, especially in web apps where many queries happen on a given request.<br><br></div>Thanks for your work!<br><br></div>Matt<br></div><div class="gmail_extra"><div><div><br><div class="gmail_quote">On 15 April 2016 at 16:28, Alexander Kozlovsky <span dir="ltr"><<a href="mailto:alexander.kozlovsky@gmail.com" target="_blank">alexander.kozlovsky@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Matthew!<br><br>At first sight it looks like a memory leak. Also, is it possible that bigger values of x in your loop retrieve a larger number of objects and hence require more memory?<br><br>Regarding the memory leak: after db_session is over, Pony releases its pointer to the session cache, and in the best case all cache content will be collected by the garbage collector. But if your code still holds a pointer to some object in the cache, that will prevent garbage collection, because objects inside a cache are interconnected. Are you holding some pointers to objects from previous db sessions?<br><br>It is possible that we have some memory leak inside Pony, but right now we are not aware of it.<br><br>You mentioned in one of your previous messages that in your code you perform cascade deletion of multiple objects, which are all loaded into memory. 
Does your current program perform something like that?<br><br>In principle, I'm for pruning cache more aggressively on db_session exit, but unfortunately some people like to continue working with objects after exiting from db_session (for example, generate HTML content using some template engine, etc.), although in my opinion it is more correct to perform such actions inside db_session.<br><br><br>Regards,<br>Alexander<br><br></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Thu, Apr 14, 2016 at 10:46 PM, Matthew Bell <span dir="ltr"><<a href="mailto:matthewrobertbell@gmail.com" target="_blank">matthewrobertbell@gmail.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div dir="ltr"><div><div><div><div><div><div><div>Hi,<br><br></div>I have code like:<br><br></div>for x in list_of_ints:<br></div>&nbsp;&nbsp;&nbsp;&nbsp;with db_session:<br></div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# do lots of database processing tied to x<br><br></div>I am doing it like this to stop the Pony cache from using a lot of memory, but cache usage still grows over time. How can I stop this happening?<br><br></div>Thanks,<br><br></div>Matt<span><font color="#888888"><br clear="all"><div><div><div><div><div><div><div><div><br>-- <br><div>Regards,<br><br>Matthew Bell<br></div>
</div></div></div></div></div></div></div></div></font></span></div>
<br></div></div>_______________________________________________<br>
ponyorm-list mailing list<br>
<a href="mailto:ponyorm-list@ponyorm.org" target="_blank">ponyorm-list@ponyorm.org</a><br>
<a href="/ponyorm-list" rel="noreferrer" target="_blank">/ponyorm-list</a><br>
<br></blockquote></div><br></div>
<br></blockquote></div><br><br clear="all"><br></div></div><span><font color="#888888">-- <br><div>Regards,<br><br>Matthew Bell<br></div>
</font></span></div>
<br></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature">Regards,<br><br>Matthew Bell<br></div>
</div>