Avro Schema – Sort Order

Avro defines a sort order for its data. For most types this is the natural ordering (for example, ascending numeric order for int). Records are compared field by field, in the order the fields are declared in the schema. You can change how each field takes part in the comparison by adding an order attribute to the field. The order attribute takes one of the following three values.

Order attribute:

ascending (default)
descending
ignore: the field is not considered for comparison.

If your MapReduce keys are based on an Avro schema, this lets you define which fields are considered for key comparison.

For example, if you want to sort only on the employee id (here in descending order) and ignore the remaining fields, the schema would look like the following.


{
  "name": "Employee",
  "type": "record",
  "doc": "employee records",
  "fields": [
    {
      "name": "empId",
      "type": "string",
      "order": "descending"
    },
    {
      "name": "empName",
      "type": "string",
      "order": "ignore"
    }
  ]
}
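
To make the effect of the order attribute concrete, here is a minimal Python sketch that mimics the comparison rules for the schema above. It does not use the Avro library. The function name compare_records and the sample employees are made up for illustration, and the sketch handles string fields only.

```python
import json
from functools import cmp_to_key

# The Employee schema from above, parsed as plain JSON (no Avro library involved).
SCHEMA = json.loads("""
{
  "name": "Employee",
  "type": "record",
  "fields": [
    {"name": "empId", "type": "string", "order": "descending"},
    {"name": "empName", "type": "string", "order": "ignore"}
  ]
}
""")


def compare_records(a, b, schema=SCHEMA):
    """Hypothetical helper: return a negative, zero, or positive number.

    Fields are visited in declaration order. A missing "order" attribute
    means ascending. Only string fields are handled in this sketch.
    """
    for field in schema["fields"]:
        order = field.get("order", "ascending")
        if order == "ignore":
            continue  # this field never influences the result
        # Strings are compared by their UTF-8 bytes.
        x = a[field["name"]].encode("utf-8")
        y = b[field["name"]].encode("utf-8")
        result = (x > y) - (x < y)
        if result != 0:
            return -result if order == "descending" else result
    return 0


employees = [
    {"empId": "E1", "empName": "Zed"},
    {"empId": "E3", "empName": "Amy"},
    {"empId": "E2", "empName": "Bob"},
]
sorted_ids = [e["empId"] for e in sorted(employees, key=cmp_to_key(compare_records))]
```

With this schema, sorted_ids comes out in descending empId order, and two records with the same empId compare as equal no matter what their empName is.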

Avro implements binary comparison, which means it does not have to deserialize the objects in order to compare them; the comparison works directly on the serialized bytes.
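
The idea of comparing serialized bytes can be sketched in plain Python. This is a hand-rolled illustration, not the Avro library's own comparator. It assumes an Employee record with two string fields (empId descending, empName ignored), each encoded as a zigzag variable-length long holding the byte length, followed by the UTF-8 bytes. The helper names write_long, read_long, encode_employee and compare_binary are hypothetical.

```python
def write_long(n):
    """Encode an integer as a zigzag, variable-length long (7 bits per byte)."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def read_long(buf, pos):
    """Decode a zigzag varint starting at pos; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_employee(emp_id, emp_name):
    """Serialize the two string fields back to back, in declaration order."""
    out = b""
    for value in (emp_id, emp_name):
        data = value.encode("utf-8")
        out += write_long(len(data)) + data
    return out


# Per-field orders in declaration order: empId, then empName (assumed schema).
ORDERS = ["descending", "ignore"]


def compare_binary(a, b, orders=ORDERS):
    """Compare two serialized records without ever building the strings."""
    pa = pb = 0
    for order in orders:
        la, pa = read_long(a, pa)
        lb, pb = read_long(b, pb)
        if order != "ignore":
            fa, fb = a[pa:pa + la], b[pb:pb + lb]
            result = (fa > fb) - (fa < fb)  # bytes compare lexicographically
            if result:
                return -result if order == "descending" else result
        # Skip over the field's bytes to reach the next field.
        pa += la
        pb += lb
    return 0


e1 = encode_employee("E1", "Zed")
e2 = encode_employee("E2", "Amy")
e1_other_name = encode_employee("E1", "Bob")
```

Only the length prefixes are decoded; the string payloads are compared as raw byte slices, and an ignored field is simply skipped over.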

Check this blog for an example program.
