SQLAlchemy 0.3 Documentation

Basic Data Mapping

Data mapping describes the process of defining Mapper objects, which associate table metadata with user-defined classes.

The Mapper's role is to perform SQL operations upon the database, associating individual table rows with instances of those classes, and individual database columns with properties upon those instances, to transparently associate in-memory objects with a persistent database representation.

When a Mapper is created to associate a Table object with a class, all of the columns defined in the Table object are associated with the class via property accessors, which add overriding functionality to the normal process of setting and getting object attributes. These property accessors keep track of changes to object attributes; these changes will be stored to the database when the application "flushes" the current state of objects (known as a Unit of Work).

Two objects provide the primary interface for interacting with Mappers and the "unit of work", which are the Query object and the Session object. Query deals with selecting objects from the database, whereas Session provides a context for loaded objects and the ability to communicate changes on those objects back to the database.

The primary method on Query for loading objects is its select() method, which has similar arguments to a sqlalchemy.sql.Select object. But this select method executes automatically and returns results, instead of awaiting an execute() call. Instead of returning a cursor-like object, it returns an array of objects.

The three configurational elements to be defined, i.e. the Table metadata, the user-defined class, and the Mapper, are typically defined as module-level variables, and may be defined in any fashion suitable to the application, with the only requirement being that the class and table metadata are described before the mapper. For the sake of example, we will be defining these elements close together, but this should not be construed as a requirement; since SQLAlchemy is not a framework, those decisions are left to the developer or an external framework.

Also, keep in mind that the examples in this section deal with explicit Session objects mapped directly to Engine objects, which represents the most explicit style of using the ORM. Options exist for how this is configured, including binding Table objects directly to Engines (described in Binding MetaData to an Engine), as well as using the "Threadlocal" plugin which provides various code shortcuts by using an implicit Session associated to the current thread (described in threadlocal).

back to section top

Synopsis

First, the metadata/mapper configuration code:

from sqlalchemy import *

# metadata
meta = MetaData()

# table object
users_table = Table('users', meta, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# class definition 
class User(object):
    pass

# create a mapper and associate it with the User class.
# technically we dont really need the 'usermapper' variable.
usermapper = mapper(User, users_table)

Note that no database definitions are required. Next we will define an Engine and connect a Session to it, and perform a simple select:

# engine
engine = create_engine("sqlite://mydb.db")

# session
session = create_session(bind_to=engine)

# select
sqluser = session.query(User).select_by(user_name='fred')[0]

# modify
user.user_name = 'fred jones'

# flush - saves everything that changed
sqlsession.flush()

back to section top

The Query Object

The method session.query(class_or_mapper) returns a Query object. Below is a synopsis of things you can do with Query:

# get a query from a Session based on class:
query = session.query(User)

# get a query from a Session given a Mapper:
query = session.query(usermapper)

# select_by, which takes keyword arguments.  the
# keyword arguments represent property names and the values
# represent values which will be compared via the = operator.
# the comparisons are joined together via "AND".
result = query.select_by(name='john', street='123 green street')

# select_by can also combine ClauseElements with key/value properties.
# all ClauseElements and keyword-based criterion are combined together
# via "AND". 
result = query.select_by(users_table.c.user_name=='john', 
        addresses_table.c.zip_code=='12345', street='123 green street')

# get_by, which takes the same arguments as select_by
# returns a single scalar result or None if no results
user = query.get_by(id=12)

# "dynamic" versions of select_by and get_by - everything past the 
# "select_by_" or "get_by_" is used as the key, and the function argument
# as the value
result = query.select_by_name('fred')
u = query.get_by_name('fred')

# get an object directly from its primary key.  this will bypass the SQL
# call if the object has already been loaded
u = query.get(15)

# get an object that has a composite primary key of three columns.
# the order of the arguments matches that of the table meta data.
myobj = query.get((27, 3, 'receipts'))

# using a WHERE criterion
result = query.select(or_(users_table.c.user_name == 'john', users_table.c.user_name=='fred'))

# using a WHERE criterion to get a scalar
u = query.selectfirst(users_table.c.user_name=='john')

# selectone() is a stricter version of selectfirst() which
# will raise an exception if there is not exactly one row
u = query.selectone(users_table.c.user_name=='john')

# using a full select object
result = query.select(users_table.select(users_table.c.user_name=='john'))

Some of the above examples above illustrate the usage of the mapper's Table object to provide the columns for a WHERE Clause. These columns are also accessible off of the mapped class directly. When a mapper is assigned to a class, it also attaches a special property accessor c to the class itself, which can be used just like the table metadata to access the columns of the table:

userlist = session.query(User).select(User.c.user_id==12)

Full documentation for Query's API : Query.

back to section top

Saving Objects

When objects corresponding to mapped classes are created or manipulated, all changes are logged by the Session object. The changes are then written to the database when an application calls flush(). This pattern is known as a Unit of Work, and has many advantages over saving individual objects or attributes on those objects with individual method invocations. Domain models can be built with far greater complexity with no concern over the order of saves and deletes, excessive database round-trips and write operations, or deadlocking issues. The flush() operation batches its SQL statements into a transaction, and can also perform optimistic concurrency checks (using a version id column) to ensure the proper number of rows were in fact affected (not supported with the current MySQL drivers).

The Unit of Work is a powerful tool, and has some important concepts that should be understood in order to use it effectively. See the Session / Unit of Work section for a full description on all its operations.

When a mapper is created, the target class has its mapped properties decorated by specialized property accessors that track changes. New objects by default must be explicitly added to the Session, however this can be made automatic by using threadlocal or SessionContext.

mapper(User, users_table)

# create a new User
myuser = User()
myuser.user_name = 'jane'
myuser.password = 'hello123'

# create another new User      
myuser2 = User()
myuser2.user_name = 'ed'
myuser2.password = 'lalalala'

# create a Session and save them
sess = create_session()
sess.save(myuser)
sess.save(myuser2)

# load a third User from the database            
sqlmyuser3 = sess.query(User).select(User.c.user_name=='fred')[0]

myuser3.user_name = 'fredjones'

# save all changes            
sqlsession.flush()

The mapped class can also specify whatever methods and/or constructor it wants:

class User(object):
    def __init__(self, user_name, password):
        self.user_id = None
        self.user_name = user_name
        self.password = password
    def get_name(self):
        return self.user_name
    def __repr__(self):
        return "User id %s name %s password %s" % (repr(self.user_id), 
            repr(self.user_name), repr(self.password))
mapper(User, users_table)

sess = create_session()
u = User('john', 'foo')
sess.save(u)
sqlsession.flush()

>>> u
User id 1 name 'john' password 'foo'

Note that the __init__() method is not called when the instance is loaded. This is so that classes can define operations that are specific to their initial construction which are not re-called when the object is restored from the database, and is similar in concept to how Python's pickle module calls __new__() when deserializing instances. To allow __init__() to be called at object load time, or to define any other sort of on-load operation, create a MapperExtension which supplies the create_instance() method (see Extending Mapper, as well as the example in the FAQ).

SQLAlchemy will only put modified object attributes columns into the UPDATE statements generated upon flush. This is to conserve database traffic and also to successfully interact with a "deferred" attribute, which is a mapped object attribute against the mapper's primary table that isnt loaded until referenced by the application.

back to section top

Defining and Using Relationships

So that covers how to map the columns in a table to an object, how to load objects, create new ones, and save changes. The next step is how to define an object's relationships to other database-persisted objects. This is done via the relation function provided by the orm module.

One to Many

So with our User class, lets also define the User has having one or more mailing addresses. First, the table metadata:

from sqlalchemy import *

metadata = MetaData()

# define user table
users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16)),
    Column('password', String(20))
)

# define user address table
addresses_table = Table('addresses', metadata,
    Column('address_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey("users.user_id")),
    Column('street', String(100)),
    Column('city', String(80)),
    Column('state', String(2)),
    Column('zip', String(10))
)

Of importance here is the addresses table's definition of a foreign key relationship to the users table, relating the user_id column into a parent-child relationship. When a Mapper wants to indicate a relation of one object to another, the ForeignKey relationships are the default method by which the relationship is determined (options also exist to describe the relationships explicitly).

So then lets define two classes, the familiar User class, as well as an Address class:

class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class Address(object):
    def __init__(self, street, city, state, zip):
        self.street = street
        self.city = city
        self.state = state
        self.zip = zip

And then a Mapper that will define a relationship of the User and the Address classes to each other as well as their table metadata. We will add an additional mapper keyword argument properties which is a dictionary relating the names of class attributes to database relationships, in this case a relation object against a newly defined mapper for the Address class:

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address)
    }
  )

Lets do some operations with these classes and see what happens:

engine = create_engine('sqlite:///mydb.db')
metadata.create_all(engine)

session = create_session(bind_to=engine)

u = User('jane', 'hihilala')
u.addresses.append(Address('123 anywhere street', 'big city', 'UT', '76543'))
u.addresses.append(Address('1 Park Place', 'some other city', 'OK', '83923'))

session.save(u)
session.flush()

INSERT INTO users (user_name, password) VALUES (:user_name, :password)
{'password': 'hihilala', 'user_name': 'jane'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'big city', 'state': 'UT', 'street': '123 anywhere street', 'user_id':1, 'zip': '76543'}
INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'some other city', 'state': 'OK', 'street': '1 Park Place', 'user_id':1, 'zip': '83923'}

A lot just happened there! The Mapper figured out how to relate rows in the addresses table to the users table, and also upon flush had to determine the proper order in which to insert rows. After the insert, all the User and Address objects have their new primary and foreign key attributes populated.

Also notice that when we created a Mapper on the User class which defined an addresses relation, the newly created User instance magically had an "addresses" attribute which behaved like a list. This list is in reality a property function which returns an instance of sqlalchemy.util.HistoryArraySet. This object fulfills the full set of Python list accessors, but maintains a unique set of objects (based on their in-memory identity), and also tracks additions and deletions to the list:

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()

UPDATE addresses SET user_id=:user_id
WHERE addresses.address_id = :addresses_address_id
[{'user_id': None, 'addresses_address_id': 2}]
INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}

Note that when creating a relation with the relation() function, the target can either be a class, in which case the primary mapper for that class is used as the target, or a Mapper instance itself, as returned by the mapper() function.

back to section top

Lifecycle Relations

In the previous example, a single address was removed from the addresses attribute of a User object, resulting in the corresponding database row being updated to have a user_id of None. But now, theres a mailing address with no user_id floating around in the database of no use to anyone. How can we avoid this ? This is acheived by using the cascade parameter of relation:

clear_mappers()  # clear mappers from the previous example
mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address, cascade="all, delete-orphan")
    }
  )

del u.addresses[1]
u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839'))

session.flush()

INSERT INTO addresses (user_id, street, city, state, zip)
VALUES (:user_id, :street, :city, :state, :zip)
{'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'}
DELETE FROM addresses WHERE addresses.address_id = :address_id
[{'address_id': 2}]

In this case, with the delete-orphan cascade rule set, the element that was removed from the addresses list was also removed from the database. Specifying cascade="all, delete-orphan" means that every persistence operation performed on the parent object will be cascaded to the child object or objects handled by the relation, and additionally that each child object cannot exist without being attached to a parent. Such a relationship indicates that the lifecycle of the Address objects are bounded by that of their parent User object.

Cascading is described fully in Cascade rules.

back to section top

Backreferences

By creating relations with the backref keyword, a bi-directional relationship can be created which will keep both ends of the relationship updated automatically, independently of database operations. Below, the User mapper is created with an addresses property, and the corresponding Address mapper receives a "backreference" to the User object via the property name user:

Address = mapper(Address, addresses_table)
User = mapper(User, users_table, properties = {
                'addresses' : relation(Address, backref='user')
            }
          )

u = User('fred', 'hi')
a1 = Address('123 anywhere street', 'big city', 'UT', '76543')
a2 = Address('1 Park Place', 'some other city', 'OK', '83923')

# append a1 to u
u.addresses.append(a1)

# attach u to a2
a2.user = u

# the bi-directional relation is maintained
>>> u.addresses == [a1, a2]
True
>>> a1.user is user and a2.user is user
True

The backreference feature also works with many-to-many relationships, which are described later. When creating a backreference, a corresponding property (i.e. a second relation()) is placed on the child mapper. The default arguments to this property can be overridden using the backref() function:

mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user':relation(User, backref=backref('addresses', cascade="all, delete-orphan"))
})

The backref() function is often used to set up a bi-directional one-to-one relationship. This is because the relation() function by default creates a "one-to-many" relationship when presented with a primary key/foreign key relationship, but the backref() function can redefine the uselist property to make it a scalar:

mapper(User, users_table)
mapper(Address, addresses_table, properties={
    'user' : relation(User, backref=backref('address', uselist=False))
})

back to section top

Selecting from Relationships

We've seen how the relation specifier affects the saving of an object and its child items, how does it affect selecting them? By default, the relation keyword indicates that the related property should be attached a lazy loader when instances of the parent object are loaded from the database; this is just a callable function that when accessed will invoke a second SQL query to load the child objects of the parent.

# define a mapper
mapper(User, users_table, properties = {
      'addresses' : relation(mapper(Address, addresses_table))
    })

# select users where username is 'jane', get the first element of the list
# this will incur a load operation for the parent table
sqluser = session.query(User).select(User.c.user_name=='jane')[0]

# iterate through the User object's addresses.  this will incur an
# immediate load of those child items
sqlfor a in user.addresses:

print repr(a)

Selecting With Joins

When using mappers that have relationships to other mappers, the need to specify query criterion across multiple tables arises. SQLAlchemy provides several core techniques which offer this functionality.

When specifying columns to the select method of Query, if the columns are attached to a table other than the mapped table, that table is automatically added to the "FROM" clause of the query. This is the same behavior that occurs when creating a non-ORM select object. Using this feature, joins can be created when querying:

sqll = session.query(User).select(and_(users.c.user_id==addresses.c.user_id, 
             addresses.c.street=='123 Green Street'))

Above, we specified selection criterion that included columns from both the users and the addresses table. Note that in this case, we had to specify not just the matching condition to the street column on addresses, but also the join condition between the users and addresses table. Even though the User mapper has a relationship to the Address mapper where the join condition is known, the select method does not assume how you want to construct joins. If we did not include the join clause, we would get:

# this is usually not what you want to do
sqll = session.query(User).select(addresses.c.street=='123 Green Street')

The above join will return all rows of the users table, even those that do not correspond to the addresses table, in a cartesian product with the matching rows from the addresses table.

Another way to specify joins more explicitly is to use the from_obj parameter of select(). This allows you to explicitly place elements in the FROM clause of the query, which could include lists of tables and/or Join constructs:

sqll = session.query(User).select(addresses.c.street=='123 Green Street', 
                from_obj=[users.join(addresses)])

In the above example, the join function by default creates a natural join between the two tables, so we were able to avoid having to specify the join condition between users and addresses explicitly.

back to section top

Creating Joins Using select_by()

Another way that joins can be created is by using the select_by method of Query, which has the ability to create joins across relationships automatically. This method is in many circumstances more convenient than, but not as flexible as, the more SQL-level approach using the select() method described in the previous section.

To issue a join using select_by, just specify a key in the argument list which is not present in the primary mapper's list of properties or columns, but is present in the property list of some relationship down the line of objects. The Query object will recursively traverse along the mapped relationships starting with the lead class and descending into child classes, until it finds a property matching the given name. For each new mapper it encounters along the path to the located property, it constructs a join across that relationship:

sqll = session.query(User).select_by(street='123 Green Street')

The above example is shorthand for:

l = session.query(User).select(and_(
         Address.c.user_id==User.c.user_id, 
         Address.c.street=='123 Green Street')
   )

All keyword arguments sent to select_by are used to create query criterion. This means that familiar select keyword options like order_by and limit are not directly available. To enable these options with select_by, you can try the SelectResults extension which offers transitive methods off the result of a select or select_by such as order_by(), limit(), etc.

Note that the select_by method, while it primarily uses keyword arguments, also can accomodate ClauseElement objects positionally; recall that a ClauseElement is genearated when producing a comparison off of a Column expression, such as users.c.name=='ed'. When using ClauseElements with select_by, these clauses are passed directly to the generated SQL and are not used to further locate join criterion. If criterion is being constructed with these kinds of expressions, consider using the select() method which is better designed to accomodate these expressions.

As of SA 0.3.4, select_by() and related functions can compare not only column-based attributes to column-based values, but also relations to object instances:

# get an instance of Address
someaddress = session.query(Address).get_by(street='123 Green Street')

# look for User instances which have the 
# "someaddress" instance in their "addresses" collection
l = session.query(User).select_by(addresses=someaddress)

Where above, the comparison denoted by addresses=someaddress is constructed by comparing all the primary key columns in the Address mapper to each corresponding primary key value in the someaddress entity. In other words, its equivalent to saying select_by(address_id=someaddress.address_id) (Alpha API).

back to section top

Generating Join Criterion Using join_to, join_via

Feature Status: Alpha API

The join_to method of Query is a component of the select_by operation, and is given a keyname in order to return a "join path" from the Query's mapper to the mapper which is referenced by a relation() of the given name:

>>> q = session.query(User)
>>> j = q.join_to('addresses')
>>> print j
users.user_id=addresses.user_id

join_to can also be given the name of a column-based property, in which case it will locate a path to the nearest mapper which has that property as a column:

>>> q = session.query(User)
>>> j = q.join_to('street')
>>> print j
users.user_id=addresses.user_id

Also available is the join_via function, which is similar to join_to, except instead of traversing through all properties to find a path to the given key, its given an explicit path to the target property:

>>> q = session.query(User)
>>> j = q.join_via(['orders', 'items'])
>>> print j
users.c.user_id==orders.c.user_id AND orders.c.item_id==items.c.item_id

Expressions produced by join_to and join_via can be used with select to create more complicated query criterion across multiple relations:

>>> l = q.select(
    (addresses_table.c.street=='some address') &
    (items_table.c.item_name=='item #4') &
    q.join_to('addresses') &
    q.join_via(['orders', 'items'])
    )

Note that the from_obj parameter of select(), described previously, allows finer grained control of joins, allowing any combination of inner and outer joins.

back to section top

Eager Loading

Eager Loading describes the loading of parent and child objects across a relation using a single query. The purpose of eager loading is strictly one of performance enhancement; eager loading has no impact on the results of a query, except that when traversing child objects within the results, lazy loaders will not need to issue separate queries to load those child objects.

Eager Loading is enabled on a per-relationship basis, either as the default for a particular relationship, or for a single query using query options, described later.

With just a single parameter lazy=False specified to the relation object, the parent and child SQL queries can be joined together.

mapper(Address, addresses_table)
mapper(User, users_table, properties = {
        'addresses' : relation(Address, lazy=False)
    }
  )

sqlusers = session.query(User).select(User.c.user_name=='Jane')

for u in users:
    print repr(u)
    for a in u.addresses:
        print repr(a)

Above, a pretty ambitious query is generated just by specifying that the User should be loaded with its child Addresses in one query. When the mapper processes the results, it uses an Identity Map to keep track of objects that were already loaded, based on their primary key identity. Through this method, the redundant rows produced by the join are organized into the distinct object instances they represent.

Recall that eager loading has no impact on the results of the query. What if our query included our own join criterion? The eager loading query accomodates this using aliases, and is immune to the effects of additional joins being specified in the original query. To use our select_by example above, joining against the "addresses" table to locate users with a certain street results in this behavior:

sqlusers = session.query(User).select_by(street='123 Green Street')

The join implied by passing the "street" parameter is separate from the join produced by the eager join, which is "aliasized" to prevent conflicts.

back to section top

Using Options to Change the Loading Strategy

The options method on the Query object provides an easy way to get alternate forms of a mapper query from an original one. The most common use of this feature is to change the "eager/lazy" loading behavior of a particular mapper, via the functions eagerload(), lazyload() and noload():

# user mapper with lazy addresses
mapper(User, users_table, properties = {
             'addresses' : relation(mapper(Address, addresses_table))
         }
)

# query object
query = session.query(User)

# make an eager loading query
eagerquery = query.options(eagerload('addresses'))
u = eagerquery.select()

# make another query that wont load the addresses at all
plainquery = query.options(noload('addresses'))

# multiple options can be specified
myquery = oldquery.options(lazyload('tracker'), noload('streets'), eagerload('members'))

# to specify a relation on a relation, separate the property names by a "."
myquery = oldquery.options(eagerload('orders.items'))

back to section top

One to One/Many to One

The above examples focused on the "one-to-many" relationship. To do other forms of relationship is easy, as the relation function can usually figure out what you want:

metadata = MetaData()

# a table to store a user's preferences for a site
prefs_table = Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key = True),
    Column('stylename', String(20)),
    Column('save_password', Boolean, nullable = False),
    Column('timezone', CHAR(3), nullable = False)
)

# user table with a 'preference_id' column
users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
    Column('password', String(20), nullable = False),
    Column('preference_id', Integer, ForeignKey("user_prefs.pref_id"))
)

# engine and some test data
engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)
engine.execute(prefs_table.insert(), dict(pref_id=1, stylename='green', save_password=1, timezone='EST'))
engine.execute(users_table.insert(), dict(user_name = 'fred', password='45nfss', preference_id=1))

# classes
class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password

class UserPrefs(object):
    pass

mapper(UserPrefs, prefs_table)

mapper(User, users_table, properties = dict(
    preferences = relation(UserPrefs, lazy=False, cascade="all, delete-orphan"),
))

# select
session = create_session(bind_to=engine)
sqluser = session.query(User).get_by(user_name='fred')

save_password = user.preferences.save_password

# modify
user.preferences.stylename = 'bluesteel'

# flush
sqlsession.flush()

back to section top

Many to Many

The relation function handles a basic many-to-many relationship when you specify the association table:

metadata = MetaData()

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id"))
)

engine = create_engine('sqlite:///')
metadata.create_all(engine)

# class definitions
class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name

class Article(object):
    pass

mapper(Keyword, keywords_table)

# define a mapper that does many-to-many on the 'itemkeywords' association 
# table
mapper(Article, articles_table, properties = dict(
    keywords = relation(Keyword, secondary=itemkeywords_table, lazy=False)
    )
)

session = create_session(bind_to=engine)

article = Article()
article.headline = 'a headline'
article.body = 'this is the body'
article.keywords.append(Keyword('politics'))
article.keywords.append(Keyword('entertainment'))
session.save(article)

sqlsession.flush()

# select articles based on a keyword.  select_by will handle the extra joins.
sqlarticles = session.query(Article).select_by(keyword_name='politics')

a = articles[0]

# clear out keywords with a new list
a.keywords = []
a.keywords.append(Keyword('topstories'))
a.keywords.append(Keyword('government'))

# flush
sqlsession.flush()

back to section top

Association Object

Many to Many can also be done with an association object, that adds additional information about how two items are related. In this pattern, the "secondary" option to relation() is no longer used; instead, the association object becomes a mapped entity itself, mapped to the association table. If the association table has no explicit primary key columns defined, you also have to tell the mapper what columns will compose its "primary key", which are typically the two (or more) columns involved in the association. Also, the relation between the parent and association mapping is typically set up with a cascade of all, delete-orphan. This is to ensure that when an association object is removed from its parent collection, it is deleted (otherwise, the unit of work tries to null out one of the foreign key columns, which raises an error condition since that column is also part of its "primary key").

from sqlalchemy import *
metadata = MetaData()

users_table = Table('users', metadata, 
    Column('user_id', Integer, primary_key = True),
    Column('user_name', String(16), nullable = False),
)

articles_table = Table('articles', metadata,
    Column('article_id', Integer, primary_key = True),
    Column('headline', String(150), key='headline'),
    Column('body', TEXT, key='body'),
)

keywords_table = Table('keywords', metadata,
    Column('keyword_id', Integer, primary_key = True),
    Column('keyword_name', String(50))
)

# add "attached_by" column which will reference the user who attached this keyword
itemkeywords_table = Table('article_keywords', metadata,
    Column('article_id', Integer, ForeignKey("articles.article_id")),
    Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")),
    Column('attached_by', Integer, ForeignKey("users.user_id"))
)

engine = create_engine('sqlite:///', echo=True)
metadata.create_all(engine)

# class definitions
class User(object):
    pass
class Keyword(object):
    def __init__(self, name):
        self.keyword_name = name
class Article(object):
    pass
class KeywordAssociation(object):
    pass

# Article mapper, relates to Keyword via KeywordAssociation
mapper(Article, articles_table, properties={
    'keywords':relation(KeywordAssociation, lazy=False, cascade="all, delete-orphan")
    }
)

# mapper for KeywordAssociation
# specify "primary key" columns manually
mapper(KeywordAssociation, itemkeywords_table,
    primary_key=[itemkeywords_table.c.article_id, itemkeywords_table.c.keyword_id],
    properties={
        'keyword' : relation(Keyword, lazy=False), 
        'user' : relation(User, lazy=False) 
    }
)

# user mapper
mapper(User, users_table)

# keyword mapper
mapper(Keyword, keywords_table)

session = create_session(bind_to=engine)
# select by keyword
sqlalist = session.query(Article).select_by(keyword_name='jacks_stories')

# user is available
for a in alist:
    for k in a.keywords:
        if k.keyword.name == 'jacks_stories':
            print k.user.user_name

Keep in mind that the association object works a little differently from a plain many-to-many relationship. Members have to be added to the list via instances of the association object, which in turn point to the associated object:

user = User()
user.user_name = 'some user'

article = Article()

assoc = KeywordAssociation()
assoc.keyword = Keyword('blue')
assoc.user = user

assoc2 = KeywordAssociation()
assoc2.keyword = Keyword('green')
assoc2.user = user

article.keywords.append(assoc)
article.keywords.append(assoc2)

session.save(article)

session.flush()

SQLAlchemy includes an extension module which can be used in some cases to decrease the explicitness of the association object pattern; this extension is described in associationproxy.