grapo

Django GSOC 2012

Mar 20th, 2012
  1. --------
  2. GSOC 2012 Customizable serialization
  3. -------
  4.  
  5. Django has a serialization framework, but it is a simple tool. Its main limitations are that you cannot define your own serialization structure and that related models are not supported. Below I present a proposal to improve the current framework.
  6. In my opinion it is not possible to create a framework that is completely independent of the formats used to serialize objects. For instance, XML has a richer syntax than JSON (e.g. fields can be tags or attributes), so we must provide functions to handle it that won't be useful in JSON serialization.
  7.  
  8. -------
  9. Features to implement:
  10. -------
  11.  
  12. Based on the issues presented for consideration, GSOC proposals from previous years and django-developers group threads, I have prepared a list of features that a good solution should have.
  13.  
  14. 1. Defining the structure of serialized object
  15. 1.1. Object fields can be at any position in output tree.
  16. 1.2. Renaming fields
  17. 1.3. Serializing non-database attributes/properties
  18. 1.4. Serializing any subset of object fields.
  19. 2. Defining own fields
  20. 2.1. Related model fields
  21. 2.1.1. Serializing foreign keys, m2m and reverse relations
  22. 2.1.2. Choose depth of serialization
  23. 2.1.3. Handling natural keys
  24. 2.1.4. Handling objects serialized before (in another location of the output tree)
  25. 2.1.5. Objects of the same type can be handled differently depending on their location
  26. 2.2. Other fields - custom serialization (e.g. only date in datetime fields)
  27. 3. One definition can support multiple serialization formats (XML, JSON, YAML)
  28. 4. Backward compatible
  29. 5. The solution should be simple. It should be easy to write your own serialization scheme.
  30.  
  31. Below I use tags like (F2.1.2), meaning support for feature 2.1.2.
  32.  
  33. ------
  34. Concept:
  35. ------
  36.  
  37. The output structure will be defined declaratively using classes. A class for the model definition is certainly needed. In my solution model fields are also defined with classes. It's the simplest way to allow a free output structure.
  38.  
  39. Suppose we want to serialize this model:
  40.  
  41. class User(Model):
  42.     fname = CharField()
  43.     lname = CharField()
  44.  
  45. class Photo(Model):
  46.     sender = ForeignKey(User)
  47.     image = ImageField()
  48.  
  49. class Comment(Model):
  50.     user = ForeignKey(User)  # was Profile; the output below shows User fields
  51.     photo = ForeignKey(Photo)
  52.     topic = CharField()
  53.     content = CharField()
  54.     created_at = DateTimeField()
  55.     ip_address = IPAddressField()
  56.  
  57.  
  58. Below is the definition of the serializer class CommentSerializer.
  59.  
  60. To serialize a comment queryset:
  61. serializers.serialize('json/xml/yaml', queryset, serializer=CommentSerializer, **options)
  62. If 'serializer' isn't provided, a default serializer is used for each format (F3)
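
The fallback to a per-format default could look roughly like the sketch below. The registry DEFAULT_SERIALIZERS and the helper pick_serializer are hypothetical names used only for illustration; they are not part of Django's serializers module, and the serializer classes are empty placeholders.

```python
class BaseModelSerializer:
    """Placeholder for the proposed base class; real logic is omitted."""


class JSONSerializer(BaseModelSerializer):
    """Placeholder default for the 'json' format."""


class XMLSerializer(BaseModelSerializer):
    """Placeholder default for the 'xml' format."""


# Hypothetical registry: one default serializer class per format name.
DEFAULT_SERIALIZERS = {"json": JSONSerializer, "xml": XMLSerializer}


def pick_serializer(fmt, serializer=None):
    """Return the user-supplied serializer, or the default for this format."""
    if serializer is not None:
        return serializer
    try:
        return DEFAULT_SERIALIZERS[fmt]
    except KeyError:
        raise ValueError("no default serializer for format %r" % fmt)
```

With this shape, the same user-defined class can be passed for any format, while callers that pass nothing keep the existing per-format behavior.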
  63.  
  64. class CommentSerializer(BaseModelSerializer):
  65.  
  66.     @attribute
  67.     def content(self, obj):
  68.         return obj.content[:20]  # first 20 characters
  69.  
  70.     def photo(self, obj):
  71.         return CommentSerializer.serialize(obj, PhotoSerializer) #(F2.1.5)
  72.  
  73.     def topic(self, obj):
  74.         return ModelSerializer.serialize(obj, TopicFieldSerializer)
  75.  
  76.     def x(self, obj): #(F1.3)
  77.         return 5
  78.  
  79.     def y(self, obj):
  80.         return 10
  81.  
  82.     def unicode__datetime(self, obj): #(F2.2)
  83.         return smart_unicode(obj.date())
  84.  
  85.     class Meta:
  86.         aliases = {'topic' : 'subject'}
  87.         #fields = (,)
  88.         exclude = ('ip_address',)
  89.         fk_level = 1
  90.         fk_natural_keys = False  # or True
  91.         fk_reserialize = False  # or True
  92.         default_field_serializer = FieldSerializer
  93.         default_related_serializer = NestedRelationSerializer # subclass of BaseModelSerializer or BaseFieldSerializer
  94.         model_name = "my_obj"
  95.         structure = "__fields pk__field special[topic__field x__field] special2{created_at y__field}"
  96.  
  97. The class contains only methods and the Meta class definition. By default each field is serialized by Meta.default_field_serializer or Meta.default_related_serializer. Class methods override this behavior. A unicode__xxx method overrides serialization for type xxx.
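
The lookup order described above (a method named after the field first, then the default, then a unicode__xxx hook for the value's type) can be sketched as follows. This is only an illustration under assumptions: field_names, serialize_field and serialize_object are hypothetical names, and a plain Python class stands in for a Django model.

```python
import datetime


def default_field_serializer(obj, name):
    """Stand-in for Meta.default_field_serializer: return the raw attribute."""
    return getattr(obj, name)


class BaseModelSerializer:
    field_names = ()  # hypothetical: names to emit, in order

    def serialize_field(self, obj, name):
        # 1. A method named after the field overrides the default.
        method = getattr(self, name, None)
        if callable(method):
            return method(obj)
        # 2. Otherwise use the default field serializer.
        value = default_field_serializer(obj, name)
        # 3. A unicode__<type> hook can post-process values of that type.
        hook = getattr(self, "unicode__" + type(value).__name__, None)
        if hook is not None:
            return hook(value)
        return value

    def serialize_object(self, obj):
        return {name: self.serialize_field(obj, name) for name in self.field_names}


class Comment:
    """Plain class standing in for the Comment model."""

    def __init__(self, content, created_at):
        self.content = content
        self.created_at = created_at


class CommentSerializer(BaseModelSerializer):
    field_names = ("content", "created_at", "x")

    def content(self, obj):
        return obj.content[:20]

    def x(self, obj):  # non-database value (F1.3)
        return 5

    def unicode__datetime(self, value):  # per-type hook (F2.2)
        return str(value.date())
```

Here content and x are produced by methods, while created_at falls through to the default and is then shortened to a date by the type hook.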
  98.  
  99. Methods can return:
  100. 1. A base type such as str, int, float etc.
  101. 2. An object subclassing BaseFieldSerializer or BaseModelSerializer
  102. 3. A list of 1 or 2
  103.  
  104. Meta class
  105. a) aliases - rename a field: topic : "..." => subject : "...". With 'topic' : '' the value returned by the topic method is placed one level higher. The metatag __fields renames all fields. #(F1.2)
  106. b) fields - fields to serialize #(F1.4)
  107. c) exclude - fields not to serialize #(F1.4)
  108. d) fk_level - depth of related model serialization. If 0, use flat serialization, otherwise nested. Each nested model gets level - 1 #(F2.1.2)
  109. e) fk_natural_keys - use natural keys instead of pk #(F2.1.3)
  110. f) fk_reserialize - whether to serialize an already serialized model again when it is nested. If True it is serialized again, otherwise fall back to flat #(F2.1.4)
  111. g) default_field_serializer - default field serializer
  112. h) default_related_serializer - default related serializer #(F2.1.1)
  113. i) model_name - if not empty, returns <model_name_value>serialized object</model_name_value>, otherwise returns the serialized object alone. Useful with nested serialization.
  114. j) structure - I don't like it :/ Declaration of the structure. It's a shortcut for declaring fields (instead of writing methods). #(F1.1)
  115. __fields - all fields and declared methods (filtered by Meta.fields and Meta.exclude)
  116. xxx__field - xxx is a model field or a defined method. A field placed with xxx__field won't appear in __fields
  117. xxx[x,y,z] - For xml: <xxx><x_name>x_value</x_name></xxx><xxx>... For json: xxx : [x,y,z]
  118. xxx{x,y,z} - For xml: <xxx><x_name>x_value</x_name><y_name>... For json xxx : { x_name : x_value ... }
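
To make the structure string more concrete, here is a minimal sketch of how it might be parsed into a tree. The function parse_structure and the ("field" | "list" | "dict", ...) tuple layout are assumptions made for illustration. Items may be separated by spaces or commas, since both appear in this proposal.

```python
import re

# A name optionally followed by an opening bracket, or a closing bracket.
# Commas and whitespace are separators; other characters are simply skipped.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*[\[{]?|[\]}]")


def parse_structure(text):
    """Parse e.g. 'a b[c d] e{f}' into a nested list of tuples."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def parse_items(closing):
        nonlocal pos
        items = []
        while pos < len(tokens):
            token = tokens[pos]
            pos += 1
            if token in ("]", "}"):
                if token != closing:
                    raise ValueError("unexpected %r" % token)
                return items
            if token[-1] in "[{":
                kind = "list" if token[-1] == "[" else "dict"
                close = "]" if kind == "list" else "}"
                items.append((kind, token[:-1], parse_items(close)))
            else:
                items.append(("field", token))
        if closing is not None:
            raise ValueError("missing %r" % closing)
        return items

    return parse_items(None)
```

A format-specific backend could then walk this tree, emitting repeated <xxx> tags or a JSON array for "list" nodes and a single container for "dict" nodes.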
  119.  
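
The interaction of fk_level and fk_reserialize can be sketched as below. This is a simplified illustration, not the proposed implementation: plain classes stand in for models, and the tables FK_FIELDS and PLAIN_FIELDS are hypothetical replacements for model introspection.

```python
class User:
    def __init__(self, pk, fname):
        self.pk, self.fname = pk, fname


class Photo:
    def __init__(self, pk, sender):
        self.pk, self.sender = pk, sender


class Comment:
    def __init__(self, pk, user, photo):
        self.pk, self.user, self.photo = pk, user, photo


# Hypothetical lookup tables replacing model introspection.
FK_FIELDS = {Comment: ("user", "photo"), Photo: ("sender",), User: ()}
PLAIN_FIELDS = {Comment: (), Photo: (), User: ("fname",)}


def serialize(obj, fk_level, fk_reserialize=False, seen=None):
    """Nest related objects while fk_level > 0; otherwise emit the pk (flat)."""
    if seen is None:
        seen = set()
    seen.add((type(obj), obj.pk))
    data = {"pk": obj.pk}
    for name in PLAIN_FIELDS[type(obj)]:
        data[name] = getattr(obj, name)
    for name in FK_FIELDS[type(obj)]:
        related = getattr(obj, name)
        already_done = (type(related), related.pk) in seen
        if fk_level > 0 and (fk_reserialize or not already_done):
            # Each nested model gets level - 1.
            data[name] = serialize(related, fk_level - 1, fk_reserialize, seen)
        else:
            data[name] = related.pk
    return data
```

With fk_level = 1 the photo is nested but its sender stays a primary key, which matches the example output later in this proposal.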
  120.  
  121. class TopicFieldSerializer(BaseFieldSerializer):
  122.  
  123.     def lower_topic(self, name, obj):
  124.         return obj.topic.lower()
  125.  
  126.     def __name__(self, name, obj):
  127.         return "value"
  128.  
  129.     def __value__(self, name, obj):
  130.         return getattr(obj, name)
  131.  
  132.     class Meta:
  133.         structure="__fields additional[]"
  134.  
  135. A field serializer has two special methods, __name__ and __value__. __value__ is the primary value returned by the field.
  136. E.g.
  137. In some model serializer class (topic="Django")
  138.     def topic(self, obj):
  139.         return ModelSerializer.serialize(obj, FieldSerializer)
  140. xml: <topic>Django</topic>
  141. json "topic" : "Django"
  142.  
  143. But what if we want to add some custom attribute (like lower_topic above)?
  144. xml: <topic><lower_topic>django</lower_topic>Django</topic> - as far as I know this is valid, but is it what we want?
  145. json topic : {lower_topic : django, ??? : Django}
  146. We have __name__ to provide a name for the primary value:
  147.     def __name__(self, name, obj):
  148.         return "value"
  149. xml: <topic><lower_topic>django</lower_topic><value>Django</value></topic>
  150. json topic : {lower_topic : django, value : Django}
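
The rule above (a bare value when there are no extra attributes, a named value otherwise) can be sketched like this. To avoid clashing with Python's own __name__ attribute, the sketch uses the hypothetical method names value_name and value in place of __name__ and __value__, and extra_names is likewise an assumed stand-in for what Meta.structure would declare.

```python
class BaseFieldSerializer:
    extra_names = ()  # hypothetical: extra attribute methods to include

    def value_name(self, name, obj):  # stands in for __name__
        return None

    def value(self, name, obj):  # stands in for __value__
        return getattr(obj, name)

    def serialize(self, name, obj):
        primary = self.value(name, obj)
        extras = {extra: getattr(self, extra)(name, obj) for extra in self.extra_names}
        if not extras:
            # No extra attributes: the field is just its primary value.
            return primary
        key = self.value_name(name, obj)
        if key is None:
            raise ValueError("extra attributes need a name for the primary value")
        extras[key] = primary
        return extras


class TopicFieldSerializer(BaseFieldSerializer):
    extra_names = ("lower_topic",)

    def lower_topic(self, name, obj):
        return obj.topic.lower()

    def value_name(self, name, obj):
        return "value"


class Comment:
    """Plain class standing in for the Comment model."""

    def __init__(self, topic):
        self.topic = topic
```

Raising an error when no name is given is one possible answer to the "???" case above. Another option would be a fixed default key.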
  151.  
  152.  
  153. The structure option in the Meta of a FieldSerializer works as in a ModelSerializer.
  154.  
  155. class PhotoSerializer(BaseModelSerializer):
  156.     @attribute
  157.     def image(self, obj):
  158.         return obj.image.url
  159.  
  160.     class Meta:
  161.         model_name = ""
  162.         structure = "__fields"
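
One way the @attribute decorator might work for XML output is to mark the method, so that the renderer emits its value as a tag attribute instead of a child element. The sketch below assumes this marking approach. The is_attribute flag, the to_xml function and the non-empty model_name are illustrative choices, and a plain class replaces the Photo model.

```python
import xml.etree.ElementTree as ET


def attribute(method):
    """Mark a serializer method so XML output renders it as a tag attribute."""
    method.is_attribute = True
    return method


class PhotoSerializer:
    model_name = "photo"  # simplified: the proposal leaves this empty when nested
    field_names = ("image", "sender")  # hypothetical ordering of fields

    @attribute
    def image(self, obj):
        return obj.image_url

    def sender(self, obj):
        return obj.sender_pk


def to_xml(serializer, obj):
    element = ET.Element(serializer.model_name)
    for name in serializer.field_names:
        method = getattr(serializer, name)
        text = str(method(obj))
        if getattr(method, "is_attribute", False):
            element.set(name, text)  # rendered as <photo image="...">
        else:
            ET.SubElement(element, name).text = text  # rendered as <sender>...</sender>
    return element


class Photo:
    """Plain class standing in for the Photo model."""

    def __init__(self, image_url, sender_pk):
        self.image_url, self.sender_pk = image_url, sender_pk
```

A JSON backend could ignore the flag entirely and treat image as an ordinary key, which is how one definition could serve both formats.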
  163.  
  164.  
  165. What we get:
  166.  
  167. XML
  168. <my_obj content="Some truncated conte">
  169.     <user>
  170.         <fname>Piotr</fname>
  171.         <lname>Grabowski</lname>
  172.     </user>
  173.     <photo image="/images/1.jpg">
  174.         <sender>1</sender>
  175.     </photo>
  176.     <pk>10</pk>
  177.     <special>
  178.         <subject>
  179.             <lower_topic>extra topic</lower_topic>
  180.             <value>EXTRA topic</value>
  181.             <additional></additional>
  182.         </subject>
  183.     </special>
  184.     <special>
  185.         <x>5</x>
  186.     </special>
  187.     <special2>
  188.         <created_at>2011-03-20</created_at>
  189.         <y>10</y>
  190.     </special2>
  191. </my_obj>
  192.  
  193. JSON
  194. {
  195.     "user" : {
  196.         "fname" : "Piotr",
  197.         "lname" : "Grabowski"
  198.     },
  199.     "photo" : {
  200.         "image" : "/images/1.jpg",
  201.         "sender" : 1
  202.     },
  203.     "pk" : 10,
  204.     "special" : [
  205.         {"subject" : { "lower_topic" : ..., ... }},
  206.         {"x" : 5}
  207.     ],
  208.     "special2" : {
  209.         "created_at" : "...",
  210.         "y" : 10
  211.     }
  212. }
  213.  
  214. -----
  215. Proof of concept
  216. -----
  217.  
  218. class JSONSerializer(BaseModelSerializer):
  219.  
  220.     def pk(self, obj):
  221.         return smart_unicode(obj._get_pk_val(), strings_only=True)
  222.  
  223.     def model(self, obj):
  224.         return smart_unicode(obj._meta)
  225.  
  226.     class Meta:
  227.         fk_level=0
  228.         fk_natural_keys = False
  229.         default_field_serializer = FieldSerializer
  230.         structure = "pk__field model__field fields{__fields}"
  231.  
  232.  
  233. class XMLSerializer(BaseModelSerializer):
  234.     @attribute
  235.     def pk(self, obj):
  236.         return smart_unicode(obj._get_pk_val(), strings_only=True)
  237.  
  238.     @attribute
  239.     def model(self, obj):
  240.         return smart_unicode(obj._meta)
  241.  
  242.     class Meta:
  243.         aliases = {'__fields' : 'field'}
  244.         fk_level=0
  245.         fk_natural_keys = False
  246.         default_field_serializer = XMLFieldSerializer
  247.         default_related_serializer = XMLFlatRelationSerializer
  248.         model_name="object"
  249.         structure = "pk__field model__field __fields"
  250.  
  251.  
  252. class XMLFieldSerializer(BaseFieldSerializer):
  253.     @attribute
  254.     def name(self, name, obj):
  255.         ...
  256.  
  257.     @attribute
  258.     def type(self, name, obj):
  259.         ...
  260. class XMLFlatRelationSerializer(BaseFieldSerializer):
  261.     @attribute
  262.     def to(self, name, obj):
  263.         ...
  264.  
  265.     @attribute
  266.     def name(self, name, obj):
  267.         ...
  268.  
  269.     @attribute
  270.     def rel(self, name, obj):
  271.         ...
  272. -----
  273. Supported features:
  274. -----
  275.  
  276. 1. Defining the structure of serialized object
  277. 1.1. Object fields can be at any position in output tree. YES
  278. 1.2. Renaming fields. YES
  279. 1.3. Serializing non-database attributes/properties. YES
  280. 1.4. Serializing any subset of object fields. YES
  281. 2. Defining own fields
  282. 2.1. Related model fields
  283. 2.1.1. Serializing foreign keys, m2m and reverse relations PARTIAL - FK OUT OF THE BOX, M2M AND REVERSE RELATIONS CAN BE CODED MANUALLY
  284. 2.1.2. Choose depth of serialization YES
  285. 2.1.3. Handling natural keys YES
  286. 2.1.4. Handling objects serialized before (in another location of the output tree) YES
  287. 2.1.5. Objects of the same type can be handled differently depending on their location YES
  288. 2.2. Other fields - custom serialization (e.g. only date in datetime fields) YES
  289. 3. One definition can support multiple serialization formats (XML, JSON, YAML) DEPENDS ON THE DEFINITION - YES IN MOST CASES
  290. 4. The solution should be simple. It should be easy to write your own serialization scheme. ???
  291.  
  292. -----
  293. What I should consider now:
  294. -----
  295. a) 2.1.1 - Handling of m2m fields, including intermediate models
  296. b) What object should be passed to a field serializer
  297. c) Serialization of heterogeneous lists - it's possible now but requires coding a class manually.
  298. d) 4. Is it really simple?
  299. e) 3. Can one definition support any format?
  300. f) Think about YAML - something between JSON and XML. Can the solution be incompatible with it?
  301.  
  302.  
  303. -----
  304. Schedule
  305. -----
  306. TODO
  307.  
  308. -----
  309. About
  310. -----
  311. My name is Piotr Grabowski. I'm a final-year student at the Institute of Computer Science, University of Wrocław (Poland). I've been working with Django for 2 years. Python is my preferred programming language, but I have also used Ruby (and Rails) and JavaScript.