[PonyORM-list] Algebra of sets

Vsevolod Novikov nnseva at gmail.com
Tue Oct 22 09:22:21 UTC 2013


Yes, I've found the .filter() method in sources but not found it in the
documentation, thank you for your explanations. It would be good and enough
replacement for intersection.

Just one inconvinience with .filter() is that the lambda parameter name
actually is not used to desemble an expression, and the previously used
generator parameter is used in lambda instead (the following example uses
djony wrapper):


>>> with orm.db_session:
...  users = orm.select(u for u in User.p)
...  me = users.filter(lambda u: u.username == 'seva')
...  for u in me: print u.username
...
seva
>>> with orm.db_session:
...  users = orm.select(u for u in User.p)
...  me = users.filter(lambda usr: usr.username == 'seva')
...  for u in me: print u.username
...
Traceback (most recent call last):
  File "<console>", line 3, in <module>
  File "<string>", line 2, in filter
  File "/usr/local/lib/python2.7/dist-packages/pony/utils.py", line 95, in
cut_traceback
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pony/orm/core.py", line
4030, in filter
    throw(TranslationError, 'Unknown name %s' % name)
  File "/usr/local/lib/python2.7/dist-packages/pony/utils.py", line 121, in
throw
    raise exc
TranslationError: Unknown name usr

As you can see, the first code block uses the same parameter name in
generator and lambda and works fine, while the second one uses different
names and fails with exception, being fine from the point of labda syntax
and semantic view.

This inconvinience leads to difficulties in code isolation (lambda writer
should know name of the generator parameter, which might be hidden in other
subsystem written by other team).

Is this inconvinience particularly one which you meant, or something other?

As for the intersection, union and subtraction, these functions might be
constructed using logic combinations of filter expressions in involved
queries instead of using special SQL syntax to implement them IMHO (and
that is really what I meant before).

The only problem is analysing source records domain for both queries
involved to set expression. It might be fine restriction if you restrict
such expressions only for the same source record domains for both queries,
and use special UNION expression for different domains, mapped directly to
standard UNION expression of SQL.

For example:
select(d for d in Document if some_keyword in d.keywords) *&*
select(d for d in Document if d.owner == some_user)

is converted to (SQL)
SELECT * FROM Document WHERE keywords like CONCAT('%',$some_keyword,'%') *
AND* owner = $some_user

While

select(d for d in Document if some_keyword in d.keywords) *|*
select(d for d in Document if d.owner == some_user)

is converted to
SELECT * FROM Document WHERE keywords like CONCAT('%',$some_keyword,'%') *OR
* owner = $some_user

But
select(d for d in Document if some_keyword in d.keywords)*.union(*
select(d for d in Document if d.owner == some_user))

is converted to
SELECT * FROM Document WHERE keywords like CONCAT('%',$some_keyword,'%')
UNION
SELECT * FROM Document WHERE owner = $some_user

The first two cases might be implemented easy using existent .filter()
functionality.

Regards,
Vsevolod




2013/10/21 Alexander Kozlovsky <alexander.kozlovsky at gmail.com>

> Hi Vsevolod!
>
> I think it is possible to implement this feature in Pony ORM with the
> following API:
>
> q1 = select(s for s in Student if ...)
> q2 = select(s for s in Student if ...)
> q3 = q1.union(q2)
> q4 = q1.intersect(q2)
> q5 = q1.except(q2)
>
> The implementation will take 3-5 work days (or may be longer, because
> MySQL has no direct support of INTERSECT and EXCEPT and requires to write a
> workaround via joins), and currently we have more urgent tasks (such as
> migration support), so I don't think this feature can be implemented during
> this year. Please add this feature to our issue list on GitHub, and we'll
> implement it when time permits.
>
> By the way, I fear that some programmers may overuse such 'algebra' and as
> a result construct inefficient queries. Let's consider the next query:
>
> SELECT * FROM T1 WHERE X > 100
> INTERSECT
> SELECT * FROM T1 WHERE Y < 200
>
> In many situations, the next equivalent query will be faster:
>
> SELECT * FROM T1 WHERE X > 100 and Y < 200
>
> So in case of implementing user-restricted document search, it may be more
> efficient to implement it in the following way, via incremental filtering
> of the same query:
>
> def document_search(current_user, specified_keyword=None,
>         part_of_title=None, document_type=None,
>         min_size=None, max_size=None):
>
>     query = select(d for d in Documents
>         if d.owner == current_user or current_user in d.editors)
>
>     if specified_keyword is not None:
>         query.filter(lambda d: specified_keyword in d.keywords)
>
>     if part_of_title is not None:
>         query.filter(lambda d: part_of_title in d.title)
>
>     if document_type is not None:
>         query.filter(lambda d: d.type = document_type)
>
>     if min_size is not None:
>         query.filter(lambda d: d.size >= min_size)
>
>     if max_size is not None:
>         query.filter(lambda d: d.size <= max_size)
>
>     return query
>
> This way of filtering should works in the development version of Pony (it
> was accidentally broken in the last release 0.4.8, but will be restored in
> the upcoming release 0.4.9)
>
>
>
> On Mon, Oct 21, 2013 at 2:27 PM, Vsevolod Novikov <nnseva at gmail.com>wrote:
>
>> Hi All,
>>
>> It would be useful to implement some kind of 'algebra of sets':
>> intersection, union, and subtraction of sets defined by two abstract
>> queries.
>>
>> Let I am writing a subsystem which should accept some queries A and B and
>> do something with it's intersection. Is it possible someway?
>>
>> The particular (simplified) usecase: Let we have Users, Groups, and
>> Documents belonging to User (individual), or Group (shared).
>>
>> We have wrote a 'user access' subsystem which restricts access to
>> Documents and returns some DocumentSet (query?) object (document_set_a)
>> which refers to all Documents belonging to this User directly or through
>> the Group.
>>
>> Also we have wrote some other subsystem, let it be 'document index',
>> which returns other DocumentSet (query?) object (document_set_b) which
>> refers to all Documents containing passed keywords.
>>
>> Now we want to write user-restricted document index: the user passes us
>> keywords and wants to see all Documents belonging to him and containing
>> these keywords. So we should return a DocumentSet which is an intersection
>> of document_set_a and document_set_b.
>>
>> Regards,
>> Vsevolod
>>
>>
>> _______________________________________________
>> ponyorm-list mailing list
>> ponyorm-list at ponyorm.com
>> /ponyorm-list
>>
>>
>
> _______________________________________________
> ponyorm-list mailing list
> ponyorm-list at ponyorm.com
> /ponyorm-list
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </ponyorm-list/attachments/20131022/939faac5/attachment-0001.html>


More information about the ponyorm-list mailing list