Fuzzing with Grammarinator and Swagger Petstore API Tutorial
Grammarinator is an ANTLR v4 grammar-based test generator.
In this tutorial we are going to use it to generate tests based on the Swagger Petstore API.
How to write an ANTLR grammar from a REST API
To run the Swagger Petstore API locally, we are going to use Docker:
docker run --name swaggerapi-petstore3 -d -p 8080:8080 swaggerapi/petstore3:unstable
Once the container is running, you can inspect the API through the interactive UI available at http://localhost:8080.
There you should have access to all available paths and schemas. Now let’s define some rules for our ANTLR v4 grammar based on some of those requests.
GET /pet/{petId} — Find Pet by ID
Using the interactive UI, it’s possible to generate and even execute HTTP calls using curl:
# interactive UI output
curl -X GET "http://localhost:8080/api/v3/pet/1" -H "accept: application/json"
From the output generated by the interactive UI, we can see that the only part that can change is the petId. Therefore we can derive a grammar by considering the petId as a number that can change:
grammar get;

start
    : 'curl -X GET "http://localhost:8080/api/v3/pet/' integerLiteral '" -H "accept: application/json"'
    ;

integerLiteral: INTEGER_LITERAL;

INTEGER_LITERAL: '0' | [1-9][0-9]*;
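To make the idea concrete, here is a minimal Python sketch of what a generator does with this grammar: it emits the fixed curl text and fills the single variable part with a string matching INTEGER_LITERAL. The function names and the 50/50 choice between alternatives are assumptions made for illustration; this is not Grammarinator's own code.

```python
import random
import re

# Hypothetical helpers that imitate the "get" grammar by hand.
INTEGER_LITERAL = re.compile(r"0|[1-9][0-9]*")
PREFIX = 'curl -X GET "http://localhost:8080/api/v3/pet/'
SUFFIX = '" -H "accept: application/json"'


def integer_literal(rng, max_extra_digits=3):
    """Imitate INTEGER_LITERAL: '0' | [1-9][0-9]* (length cap is an assumption)."""
    if rng.random() < 0.5:
        return "0"
    digits = rng.choice("123456789")
    for _ in range(rng.randint(0, max_extra_digits)):
        digits += rng.choice("0123456789")
    return digits


def start(rng):
    """Imitate the start rule: fixed text around one integerLiteral."""
    return PREFIX + integer_literal(rng) + SUFFIX


def derives_from_grammar(command):
    """Check that a command has the shape described by the grammar."""
    if not (command.startswith(PREFIX) and command.endswith(SUFFIX)):
        return False
    pet_id = command[len(PREFIX):len(command) - len(SUFFIX)]
    return INTEGER_LITERAL.fullmatch(pet_id) is not None


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(start(rng))
```

Note that an ID such as 01 is rejected, because the lexer rule only allows a leading zero for the literal 0 itself.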
POST /pet/{petId} — Post Pet by Id
# interactive UI output
curl -X POST "http://localhost:8080/api/v3/pet/1?name=Beethoven&status=available" -H "accept: */*" -d ""
From the output, we can identify three variables: the pet ID, 1; the pet name, Beethoven; and the pet status, available. Therefore we end up with this grammar:
grammar post;

start
    : 'curl -X POST "http://localhost:8080/api/v3/pet/' integerLiteral '?name=' text '&status=' status '" -H "accept: */*" -d ""'
    ;

integerLiteral: INTEGER_LITERAL;
text: TEXT;
status: 'available' | 'pending' | 'sold';

TEXT: [_a-zA-Z0-9]+;
INTEGER_LITERAL: '0' | [1-9][0-9]*;
PUT /pet — Put Pet
# interactive UI output
curl -X PUT "http://localhost:8080/api/v3/pet" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"id\":10,\"name\":\"Roger\",\"category\":{\"id\":1,\"name\":\"Dogs\"},\"photoUrls\":[\"string\"],\"tags\":[{\"id\":0,\"name\":\"string\"}],\"status\":\"available\"}"
From the output we can identify 6 variables: the petId; the pet name; the category, with id and name; photoUrls, a list of urls; tags, a list of tags (each tag being: id and name); and the status. With all these variables we end up with the following grammar:
grammar put;

start
    : 'curl -X PUT "http://localhost:8080/api/v3/pet" -H "accept: application/json" -H "Content-Type: application/json" -d "{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\",\\"category\\":{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\"},\\"photoUrls\\":[\\"' listOfText '\\"],\\"tags\\":[' listOfTags '],\\"status\\":\\"' status '\\"}"'
    ;

listOfText: TEXT ( '\\" , \\"' TEXT)*;
text: TEXT;
listOfTags: tag ( ',' tag)*;
tag: '{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\"}';
integerLiteral: INTEGER_LITERAL;
status: 'available' | 'pending' | 'sold';

TEXT: [_a-zA-Z0-9]+;
INTEGER_LITERAL: '0' | [1-9][0-9]*;
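Because this grammar embeds an escaped JSON document inside a double-quoted shell argument, it is easy to get the escaping wrong. A quick way to check a command of this shape is to tokenize it the way a POSIX shell would and parse the -d argument as JSON. The sketch below uses only the Python standard library; the helper name extract_json_body is hypothetical, and the example command simply follows the shape of the grammar above.

```python
import json
import shlex


def extract_json_body(curl_command):
    """Return the parsed JSON payload that curl would receive via -d."""
    # shlex.split applies POSIX quoting rules, so \" inside "..." becomes ".
    argv = shlex.split(curl_command)
    body = argv[argv.index("-d") + 1]
    return json.loads(body)


# A command with the shape produced by the "put" grammar.
example = (
    r'curl -X PUT "http://localhost:8080/api/v3/pet" '
    r'-H "accept: application/json" -H "Content-Type: application/json" '
    r'-d "{\"id\":7,\"name\":\"zS\",\"category\":{\"id\":70,\"name\":\"v6\"},'
    r'\"photoUrls\":[\"M1\"],\"tags\":[{\"id\":0,\"name\":\"U\"}],\"status\":\"pending\"}"'
)

if __name__ == "__main__":
    print(extract_json_body(example))
```

If json.loads raises an error here, the escaping in the grammar (or a missing separator in listOfText or listOfTags) is the first place to look.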
DELETE /pet/{petId} — Delete Pet by ID
# interactive UI output
curl -X DELETE "http://localhost:8080/api/v3/pet/1" -H "accept: */*" -H "api_key: any_key"
From the output, we can see that it’s really similar to the GET example since it has a pet ID variable. The only difference is the extra field api_key that can be any kind of value. We can derive the following grammar:
grammar delete;

start
    : 'curl -X DELETE "http://localhost:8080/api/v3/pet/' integerLiteral '" -H "accept: */*" -H "api_key: ' any '"'
    ;

integerLiteral: INTEGER_LITERAL;
any: ANY+;

INTEGER_LITERAL: '0' | [1-9][0-9]*;
ANY: .;
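One consequence of ANY: . is that the generated api_key may contain any character, including a double quote, which would end the quoted header early when the command is pasted into a shell. The following sketch, using only Python's shlex module, shows one way to see which -H values curl would actually receive. The helper name header_values is hypothetical, and the check covers quoting only (it does not model shell expansions such as $ or backticks).

```python
import shlex


def header_values(command):
    """Return the -H arguments curl would receive, or None if quoting is unbalanced."""
    try:
        argv = shlex.split(command)
    except ValueError:
        # Raised by shlex for an unterminated quote.
        return None
    return [argv[i + 1] for i, arg in enumerate(argv[:-1]) if arg == "-H"]


ok = 'curl -X DELETE "http://localhost:8080/api/v3/pet/9" -H "accept: */*" -H "api_key: Q{V]"'
broken = 'curl -X DELETE "http://localhost:8080/api/v3/pet/9" -H "accept: */*" -H "api_key: a"b"'

if __name__ == "__main__":
    print(header_values(ok))      # ['accept: */*', 'api_key: Q{V]']
    print(header_values(broken))  # None
```

If such cases matter for your tests, restricting ANY to a narrower character set is a simple option.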
Result
Merging all the grammars, we end up with the SwaggerPetstore.g4 grammar:
grammar SwaggerPetstore;

start: get | post | put | delete;

get
    : 'curl -X GET "http://localhost:8080/api/v3/pet/' integerLiteral '" -H "accept: application/json"'
    ;

post
    : 'curl -X POST "http://localhost:8080/api/v3/pet/' integerLiteral '?name=' text '&status=' status '" -H "accept: */*" -d ""'
    ;

put
    : 'curl -X PUT "http://localhost:8080/api/v3/pet" -H "accept: application/json" -H "Content-Type: application/json" -d "{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\",\\"category\\":{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\"},\\"photoUrls\\":[\\"' listOfText '\\"],\\"tags\\":[' listOfTags '],\\"status\\":\\"' status '\\"}"'
    ;

delete
    : 'curl -X DELETE "http://localhost:8080/api/v3/pet/' integerLiteral '" -H "accept: */*" -H "api_key: ' any '"'
    ;

listOfText: TEXT ( '\\" , \\"' TEXT)*;
listOfTags: tag ( ',' tag)*;
tag: '{\\"id\\":' integerLiteral ',\\"name\\":\\"' text '\\"}';
integerLiteral: INTEGER_LITERAL;
text: TEXT;
any: ANY+;
status: 'available' | 'pending' | 'sold';

INTEGER_LITERAL: '0' | [1-9][0-9]*;
TEXT: [_a-zA-Z0-9]+;
ANY: .;
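Since the start rule picks one of four alternatives, it can be useful to check how a batch of generated commands is spread across the HTTP methods. Below is a small sketch that classifies commands by the method following curl -X. The function name count_methods is hypothetical, and nothing here depends on Grammarinator itself.

```python
import re
from collections import Counter

# Matches the fixed prefix shared by all four alternatives of the start rule.
METHOD = re.compile(r"^curl -X (GET|POST|PUT|DELETE) ")


def count_methods(commands):
    """Count generated commands per HTTP method; anything else is 'unknown'."""
    counts = Counter()
    for command in commands:
        match = METHOD.match(command)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'curl -X GET "http://localhost:8080/api/v3/pet/1" -H "accept: application/json"',
        'curl -X POST "http://localhost:8080/api/v3/pet/7?name=q&status=pending" -H "accept: */*" -d ""',
    ]
    print(count_methods(sample))
```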
How to use Grammarinator with an ANTLR grammar
Now that we have our SwaggerPetstore grammar defined, we can use Grammarinator to generate tests from it.
First, install it by running:
pip3 install grammarinator
Then start by running the processor:
grammarinator-process ./SwaggerPetstore.g4 -o ./out --no-actions
Then, use the generator to produce test cases:
grammarinator-generate -p ./out/SwaggerPetstoreUnparser.py -l ./out/SwaggerPetstoreUnlexer.py -o ./out/gen/ -n 100
This command generates 100 executable test cases in the folder ./out/gen, for example:
#1
curl -X GET "http://localhost:8080/api/v3/pet/1" -H "accept: application/json"
#2
curl -X POST "http://localhost:8080/api/v3/pet/7?name=q&status=pending" -H "accept: */*" -d ""
#3
curl -X PUT "http://localhost:8080/api/v3/pet" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"id\":7,\"name\":\"zS\",\"category\":{\"id\":70,\"name\":\"v6\"},\"photoUrls\":[\"M1\"],\"tags\":[{\"id\":0,\"name\":\"U\"}],\"status\":\"pending\"}"
#4
curl -X DELETE "http://localhost:8080/api/v3/pet/9" -H "accept: */*" -H "api_key: Q{V]"
With the Swagger Petstore API running locally, you can execute any of the generated commands in your shell.
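Running 100 commands by hand gets tedious, so a small driver script can help. The sketch below reads every file in ./out/gen, tokenizes each command with shlex, and runs it without a shell. It assumes each generated file contains a single command and that curl is installed; the function names are hypothetical. Because no shell is involved, characters such as $ are passed to curl literally, which differs slightly from pasting the command into a terminal.

```python
import shlex
import subprocess
from pathlib import Path


def load_commands(gen_dir):
    """Read one command per generated file, skipping empty files."""
    commands = []
    for path in sorted(Path(gen_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace").strip()
        if text:
            commands.append(text)
    return commands


def run_command(command, timeout=10):
    """Run a command without a shell; return its exit code, or None if it cannot be run."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes in the generated text
    if not argv:
        return None
    try:
        result = subprocess.run(argv, capture_output=True, timeout=timeout)
    except (OSError, ValueError, subprocess.TimeoutExpired):
        return None  # missing binary, embedded NUL byte, or timeout
    return result.returncode


if __name__ == "__main__":
    gen_dir = Path("./out/gen")
    if gen_dir.is_dir():
        for command in load_commands(gen_dir):
            print(run_command(command), command)
    else:
        print(f"{gen_dir} not found; run grammarinator-generate first")
```

An exit code of 0 only means that curl completed the request. To judge the API's behavior you would also want to inspect the HTTP status, for example by adding curl's -w option to the grammar.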
Conclusion
In general, Grammarinator is a very good tool for generating random test cases. The complexity of the results depends on the grammar used as input, and since it builds on ANTLR v4 grammars, the generated tests can become as complex as the grammar allows.