Problem
The API for working with existing database schemas (when Python classes aren't available) is confusing and inconsistent:
- VirtualModule and create_virtual_module are identical (one is an alias of the other), so it is unclear which to use
- spawn_missing_classes uses unusual "spawn" terminology
- gc.py references schema.spawn_table(), which doesn't exist
- No simple way to get a single table class without creating a whole module
- Context/namespace manipulation is confusing for users
Proposed Changes
1. Add Schema.__getitem__ for direct table access
table_class = schema['TableName']
2. Add Schema.__iter__ for iteration
for table_class in schema:
    print(table_class.full_table_name)
3. Add Schema.get_table(name) method
Returns a single table class by name. This also gives gc.py a real method to call in place of the nonexistent schema.spawn_table().
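To make the relationship between changes 1-3 concrete, here is a minimal sketch of how the three entry points could share one lookup. It does not use DataJoint: ToySchema, its _tables dict, and the stub Subject/Session classes are hypothetical stand-ins, and the real Schema would presumably build table classes from the database rather than from a dict.

```python
class ToySchema:
    """Hypothetical stand-in for dj.Schema; keeps table classes in a dict."""

    def __init__(self, tables):
        # tables: mapping of class name -> table class (assumed storage)
        self._tables = dict(tables)

    def get_table(self, name):
        """Return the table class registered under `name`."""
        try:
            return self._tables[name]
        except KeyError:
            raise KeyError(f"Table {name!r} not found in schema") from None

    def __getitem__(self, name):
        # schema['Subject'] is just sugar for schema.get_table('Subject')
        return self.get_table(name)

    def __iter__(self):
        # Yield table classes in the order they were registered
        return iter(self._tables.values())


# Stub classes standing in for generated table classes
class Subject:
    pass


class Session:
    pass


schema = ToySchema({"Subject": Subject, "Session": Session})
assert schema["Subject"] is schema.get_table("Subject")
print([cls.__name__ for cls in schema])
```

Routing __getitem__ through get_table keeps a single code path for name resolution and error handling, so gc.py and interactive users would see the same behavior.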
4. Rename spawn_missing_classes to generate_classes
Clearer terminology ("generate" vs "spawn").
5. Add dj.virtual_schema() function
Cleaner alternative to VirtualModule:
lab = dj.virtual_schema('my_lab_schema')
lab.Subject.fetch()
6. Deprecate create_virtual_module
Keep VirtualModule and deprecate the redundant create_virtual_module alias.
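One common way to deprecate an alias is a thin wrapper that warns and then delegates. The sketch below illustrates that pattern with the standard warnings module only; virtual_module_sketch and create_virtual_module_sketch are hypothetical placeholders, not the actual DataJoint functions, and the warning text is an assumption.

```python
import warnings


def virtual_module_sketch(module_name, schema_name):
    """Placeholder for the retained constructor (hypothetical)."""
    return (module_name, schema_name)


def create_virtual_module_sketch(*args, **kwargs):
    """Deprecated alias: warn the caller, then delegate unchanged."""
    warnings.warn(
        "create_virtual_module is deprecated; use VirtualModule instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return virtual_module_sketch(*args, **kwargs)


# Capture the warning to show that the alias still works but is flagged
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = create_virtual_module_sketch("lab", "my_lab_schema")

print(result, [str(w.message) for w in caught])
```

Existing callers keep working for a release cycle while the warning points them at the retained name.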
Files to Modify
src/datajoint/schemas.py - Main implementation
src/datajoint/__init__.py - Exports
src/datajoint/gc.py - Update to use get_table()
tests/integration/test_virtual_module.py - Add tests
Example Usage After Changes
# Current (confusing)
module = dj.VirtualModule('lab', 'my_lab_schema')
module.Subject.fetch()
# Or
schema = dj.Schema('my_lab_schema')
schema.spawn_missing_classes(context=locals())
Subject.fetch()
# Proposed (cleaner)
schema = dj.Schema('my_lab_schema')
Subject = schema['Subject'] # get one table
Subject.fetch()
# Or iterate
for table in schema:
    print(len(table()))
# Or use virtual_schema for module-like access
lab = dj.virtual_schema('my_lab_schema')
lab.Subject.fetch()