Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraper application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses.
Some Blackfire Player use cases:
Blackfire Player executes scenarios written in a special DSL (files should end with .bkf).
Use blackfire-player with Docker. The working directory is expected to be /app in the container.

Example running a scenario located in the my-scenario.bkf file:

    docker run --rm -it -e BLACKFIRE_CLIENT_ID -e BLACKFIRE_CLIENT_TOKEN -v "`pwd`:/app" blackfire/player run my-scenario.bkf
Note
The BLACKFIRE_CLIENT_ID and BLACKFIRE_CLIENT_TOKEN environment variables must be exposed from the host in order to use the Blackfire Profiler integration.
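You may also add a shell alias (in .bashrc, .zshrc, etc.) for convenience. The single quotes defer the $(pwd) expansion until the alias is used, so the current directory is mounted rather than the one the shell started in:

    alias blackfire-player='docker run --rm -it -e BLACKFIRE_CLIENT_ID -e BLACKFIRE_CLIENT_TOKEN -v "$(pwd):/app" blackfire/player'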
Don’t forget to restart your terminal for it to take effect. You can then use blackfire-player as if it were the binary itself:

    blackfire-player --version
    blackfire-player list
    blackfire-player run my-scenario.bkf
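When the player is launched from a script instead of an interactive shell, the same Docker invocation can be assembled as an argument list. The following Python sketch mirrors the documented command; docker_player_command and its parameters are hypothetical names introduced for illustration and are not part of Blackfire Player.

```python
import os


def docker_player_command(player_args, workdir=None, interactive=True):
    """Return the argv list for running Blackfire Player through Docker.

    Hypothetical helper: it only reproduces the flags of the documented
    "docker run" command.
    """
    # The container expects the scenario files under /app.
    workdir = os.path.abspath(workdir or os.getcwd())
    command = ["docker", "run", "--rm"]
    if interactive:
        # -it mirrors the documented command and needs an attached terminal.
        command.append("-it")
    command += [
        # Forward the credentials from the host environment by name.
        "-e", "BLACKFIRE_CLIENT_ID",
        "-e", "BLACKFIRE_CLIENT_TOKEN",
        "-v", f"{workdir}:/app",
        "blackfire/player",
    ]
    return command + list(player_args)


print(docker_player_command(["run", "my-scenario.bkf"], workdir="/tmp/project"))
```

The resulting list can be handed to subprocess.run. Passing interactive=False drops -it, which Docker typically rejects when no terminal is attached.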
Use the run command to execute a scenario file:

    blackfire-player run scenario.bkf

Note
The file argument may be omitted when using the standard input:

    cat scenario.bkf | blackfire-player run

You can also run scenarios contained in a .blackfire.yaml file:

    blackfire-player run .blackfire.yaml
Use the --endpoint option to override the endpoint defined in the scenario file:

    blackfire-player run scenario.bkf --endpoint=http://example.com/

Use the --json option to output a JSON report:

    blackfire-player run scenario.bkf --json

Use the --variable option to override variable values:

    blackfire-player run scenario.bkf --variable "foo=bar" --variable "bar=foo"

Use the --concurrency option to run scenarios in parallel (experimental):

    blackfire-player run scenario.bkf --concurrency=5
Use -v to get logs about the progress of the player, or use the tracer option to store all requests and responses on disk.
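The options described above can be combined in a single invocation. As a minimal sketch, the Python function below assembles such a command line; build_run_command is a hypothetical helper that only uses the flags documented in this section.

```python
def build_run_command(scenario, endpoint=None, json_report=False,
                      variables=None, concurrency=None):
    """Assemble a "blackfire-player run" argv list (hypothetical helper)."""
    command = ["blackfire-player", "run", scenario]
    if endpoint is not None:
        # Overrides the endpoint defined in the scenario file.
        command.append(f"--endpoint={endpoint}")
    if json_report:
        command.append("--json")
    # One --variable flag per name=value pair, in insertion order.
    for name, value in (variables or {}).items():
        command += ["--variable", f"{name}={value}"]
    if concurrency is not None:
        command.append(f"--concurrency={concurrency}")
    return command


print(build_run_command(
    "scenario.bkf",
    endpoint="http://example.com/",
    json_report=True,
    variables={"foo": "bar", "bar": "foo"},
    concurrency=5,
))
```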
The command uses the following exit codes in case of failure:

- 64 if at least one scenario fails;
- 65 if a fatal error occurs, preventing the build from being played correctly;
- 66 if a non-fatal error occurs.

The validate command checks whether the given scenario file is valid:

    blackfire-player validate scenario.bkf
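A wrapper script can translate the exit codes documented for the run command into readable messages. The Python sketch below is one way to do it; describe_run_exit_code is a hypothetical name, and treating 0 as success is an assumption based on the usual shell convention, since the documentation only lists failure codes.

```python
# Failure codes documented for "blackfire-player run".
RUN_EXIT_CODES = {
    64: "at least one scenario failed",
    65: "a fatal error prevented the build from being played correctly",
    66: "a non-fatal error occurred",
}


def describe_run_exit_code(code):
    """Map an exit code of the run command to a short description."""
    if code == 0:
        # Assumption: 0 means success, as for most command-line tools.
        return "success"
    return RUN_EXIT_CODES.get(code, f"unexpected exit code {code}")


for code in (0, 64, 65, 66):
    print(code, describe_run_exit_code(code))
```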
The file argument may be omitted when using the standard input:

    cat scenario.bkf | blackfire-player validate

Note
It is not possible to validate a scenario contained in a .blackfire.yaml file.

Use the --json option to output a JSON report:

    blackfire-player validate scenario.bkf --json
The command uses the following exit codes in case of failure:

- 64 if the file is invalid.

Blackfire Player lets you crawl an application thanks to descriptive scenarios written in a domain-specific language:

    name "A build made of scenario"

    # Default endpoint
    # Can be overridden with the option "--endpoint=http://newendpoint.com"
    endpoint "http://example.com/"

    scenario
        name "Scenario Name"

        visit url('/')
            expect status_code() == 200
This example shows how to make a request on an HTTP application (http://example.com/) and make sure that it behaves the way you expect it to by writing expectations (here, that the status code of the response is 200).
Store the scenario in a scenario.bkf file, and run it:

    blackfire-player run scenario.bkf

    # or
    php blackfire-player run scenario.bkf

Add more requests to a scenario by indenting lines as below:

    scenario
        visit url('/')
            expect status_code() == 200

        visit url('/blog/')
            expect status_code() == 200
Note
The line indentation defines the structure, as in Python scripts or YAML files. Validate bkf files with the validate command: blackfire-player validate scenario.bkf.
A scenario is a sequence of HTTP calls (steps) that share the HTTP session and cookies. Scenario definitions are declarative: the order of settings (like expectations) within a “step” does not matter.
Instead of making discrete requests like above, you can also interact with the HTTP response if the content type is HTML by clicking on links, submitting forms, or following redirections (see Making requests for more information):

    scenario
        visit url('/')
            expect status_code() == 200

        click link('Read more')
            expect status_code() == 200
Note
If your scenario does not work as expected, use -v to get more verbose output.

Tip
You can add comments in a scenario file by prefixing the line with #:

    # This is a comment
    scenario
        # Comments are ignored
        visit url('/')
            expect status_code() == 200
There are several ways you can jump from one HTTP request to the next.
visit

visit goes directly to the referenced HTTP URL (defaults to the GET HTTP method unless you define one explicitly):

    scenario
        visit url('/')
            method 'POST'

You can also pass a request body:

    scenario
        visit url('/')
            method 'PUT'
            body '{ "title": "New Title" }'
Tip
An expression can be written on several lines with the following syntax:

    scenario
        visit url('/login')
            method 'POST'
            body
            """
            {
                "user": "john",
                "password": "doe"
            }
            """

Starting from version v1.11.0, you can also use variables by adding the i option to a multi-line string:

    scenario
        visit url('/login')
            method 'POST'
            set username "john"
            set password "doe"
            body
            """i
            {
                "user": "${username}",
                "password": "${password}"
            }
            """
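The i option behaves like template substitution applied to the multi-line string. As a rough analogy only (this is not how the player is implemented), Python's string.Template uses the same ${name} placeholder syntax. The sketch below renders the login body from the example; render_login_body is a hypothetical name.

```python
import json
from string import Template

# Same placeholders as the """i body in the scenario example.
BODY_TEMPLATE = Template("""
{
    "user": "${username}",
    "password": "${password}"
}
""")


def render_login_body(username, password):
    """Replace the ${...} placeholders with the given values."""
    return BODY_TEMPLATE.substitute(username=username, password=password)


body = render_login_body("john", "doe")
# Parsing the result confirms the rendered text is valid JSON.
print(json.loads(body))
```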
click

click clicks on a link in an HTML page (takes an expression as an argument):

    scenario
        click link("Add a blog post")

submit

submit submits a form in an HTML page (takes an expression as an argument); parameters to submit with the form are defined via param entries:

    scenario
        submit button("Submit")
            param title 'Happy Scraping'
            param content 'Scraping with Blackfire Player is so easy!'

            # File Upload:
            # the path is relative to the current .bkf file
            # the name parameter is optional
            param image file('relative/path/to/image.png', 'blackfire.png')
Values can also be randomly generated via the fake() function:

    scenario
        submit button("Submit")
            param title fake('sentence', 5)
            param content join(fake('paragraphs', 3), "\n\n")

Generate random images with the simple_image generator:

    scenario
        submit button("Submit")
            param image file(fake('simple_image', null, 400, 300, 'png', true, true), 'placeholder.png')

Note
fake() uses the Faker library under the hood. Read the simple_image generator documentation for more information about its arguments.
HTTP redirections are never followed automatically to let you write expectations and assertions on redirect responses:

    scenario
        visit "redirect.php"
            expect status_code() == 302
            expect header('Location') == '/redirected.php'

Use follow to follow one redirection:

    scenario
        visit "redirect.php"
            expect status_code() == 302
            expect header('Location') == '/redirected.php'

        follow
            expect status_code() == 200

follow_redirects switches the player to automatically follow all redirections:

    scenario
        follow_redirects true

or:

    scenario
        visit "redirect.php"
            follow_redirects

Please note that when using follow_redirects, expectations (expect) and assertions (assert) are checked on the redirecting response (so, before the redirection). Use a follow step if you need to check them after the redirection.
include

include lets you embed repetitive steps into several scenarios to avoid copy/pasting the same code over and over again.

In a groups.bkf file, write a group that contains the logic to log in:

    group login
        visit url('/login')
            expect status_code() == 200

        submit button('Login')
            param user 'admin'
            param password 'admin'

Then, in another file, load the group and include it when you need it:

    load "groups.bkf"

    scenario
        name "Scenario Name"

        include login

        visit url('/admin')
            expect status_code() == 200
Each step can be configured via the following options.
header

header sets a header:

    scenario
        visit url('/')
            header "Accept-Language: en-US"

Tip
Simulating a specific browser is as simple as overriding the default User-Agent and using fake():

    scenario
        visit url('/')
            header 'User-Agent: ' ~ fake('firefox')
auth

auth sets the Authorization header:

    scenario
        visit url('/')
            auth "username:password"
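To make the effect of auth concrete, the Python sketch below computes the Authorization header value that a "username:password" pair produces under the HTTP Basic scheme. That the player uses the Basic scheme is an assumption here, and basic_authorization_header is a hypothetical name.

```python
import base64


def basic_authorization_header(credentials):
    """Build an HTTP Basic Authorization value from "user:password".

    Assumption: the credentials are sent using the Basic scheme.
    """
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


print(basic_authorization_header("username:password"))
# prints: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```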
wait

wait adds a delay in milliseconds after sending the request:

    scenario
        visit url('/')
            wait 10000

The wait value can be any valid expression; get a random delay by using fake():

    scenario
        visit url('/')
            wait fake('numberBetween', 1000, 3000)
json

json configures the request to upload JSON-encoded data as the body:

    scenario
        visit url('/')
            method 'POST'
            param foo "bar"
            json true
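The difference the json option makes can be pictured as two encodings of the same parameters. The Python sketch below contrasts them; it assumes that parameters are form-encoded when json is not set, and encode_params is a hypothetical name rather than anything the player exposes.

```python
import json
from urllib.parse import urlencode


def encode_params(params, as_json=False):
    """Encode request parameters as a JSON document or as a form body."""
    if as_json:
        # Corresponds to "json true" in the scenario.
        return json.dumps(params)
    # Assumed default: classic form encoding.
    return urlencode(params)


print(encode_params({"foo": "bar"}))
print(encode_params({"foo": "bar"}, as_json=True))
```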
You can also set some of these options for all steps of a scenario:

    scenario
        auth "username:password"
        header "Accept-Language: en-US"

… which can be disabled on any given step by setting the value to false:

    scenario
        visit url('/')
            header "Accept-Language: false"
            auth false
Expectations are expressions evaluated against the current HTTP response and if one of them returns a falsy value, Blackfire Player stops the run and generates an error.
Expressions have access to the following functions:

- current_url(): Returns the current URL;
- status_code(): The HTTP status code for the current HTTP response;
- header(): Returns the value of an HTTP header;
- body(): The HTTP body for the current HTTP response;
- trim(): Strip whitespace from the beginning and end of a string;
- unique(): Removes duplicate values from an array;
- join(): Join array elements with a string;
- merge(): Merge one or more arrays;
- regex(): Perform a regular expression match;
- css(): Returns nodes matching the CSS selector (for HTML responses);
- xpath(): Returns nodes matching the XPath selector (for HTML and XML responses);
- json(): Returns JSON elements (from the response) matching the JMESPath expression;
- transform(): Returns JSON elements matching the CSS expression.

The css() and xpath() functions return Symfony\Component\DomCrawler\Crawler instances (see the DomCrawler documentation for the methods you can call on Crawler instances); the json() function returns a PHP array. The json() function accepts JMESPath.
The results of function calls can be checked with expression operators.
Note
Learn more about Expressions syntax in the Symfony documentation.
Here are some expression examples:

    # return all HTML nodes matching ".post h2 a"
    css(".post h2 a")

    # return the text of the first node matching ".post h2 a"
    css(".post h2 a").first().text()

    # return the href attribute of the first node matching ".post h2 a"
    css(".post h2 a").first().attr("href")

    # check that "h1" contains "Welcome"
    css("h1:contains('Welcome')").count() > 0

    # same as above
    css("h1").first().text() matches "/Welcome/"

    # return the Age response HTTP header
    header("Age")

    # check that the HTML body contains "Welcome"
    body() matches "/Welcome/"

    # get a value
    json("_links.store.href")

    # get keys
    json('arguments."sql.pdo.queries".keys(@)')
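To picture what a dotted JMESPath expression such as "_links.store.href" selects, the Python sketch below walks nested dictionaries key by key. It is a deliberately simplified stand-in: it handles only plain dotted identifiers, not quoted identifiers or functions such as keys(@). The lookup name and the sample document are made up for illustration.

```python
def lookup(document, path):
    """Resolve a dotted path such as "_links.store.href" in nested dicts."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            # JMESPath yields null when a key is missing; None mirrors that.
            return None
        current = current[key]
    return current


# Made-up response document, shaped to fit the expression used above.
response = {"_links": {"store": {"href": "/store/42"}}}
print(lookup(response, "_links.store.href"))
```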
Variables can be defined to make your scenarios dynamic. Use set to define the default value:

    scenario
        name "HTTP Cache"
        set env "dev"

        when "prod" == env
            visit url('/')
                # check HTTP cache, but only on production
And override it with the --variable option on the CLI:

    blackfire-player run scenario.bkf --variable env=prod
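The override rule can be modelled as a dictionary of defaults overlaid with the name=value pairs given on the command line. The Python sketch below illustrates that described behaviour only; resolve_variables is a hypothetical name and the code is not taken from the player.

```python
def resolve_variables(defaults, cli_variables):
    """Overlay "name=value" strings from the CLI on the scenario defaults."""
    resolved = dict(defaults)
    for item in cli_variables:
        # Split on the first "=" only, so values may contain "=" themselves.
        name, _, value = item.partition("=")
        resolved[name] = value
    return resolved


variables = resolve_variables({"env": "dev"}, ["env=prod"])
# Mirrors the condition: when "prod" == env
print("prod" == variables["env"])
```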
Use with to iterate over a set of data:

    scenario
        name "HTTP Cache"
        set paths ["/", "/blog/"]

        with path in paths
            visit url(path)
                name "Checking performance on path: " ~ path
                expect status_code() == 200
                # performance checks

    scenario
        name "Checks on key pages"

        with name, data in \
            { \
                admin: { slug: "/admin/", expectedStatusCode: 401 }, \
                products: { slug: "/products/", expectedStatusCode: 200 }, \
                about: { slug: "/about/", expectedStatusCode: 200 } \
            }

            visit url(data["slug"])
                name "Checking performance on path: " ~ name
                expect status_code() == data["expectedStatusCode"]

Use while to perform loops:

    scenario
        name "While loops"

        visit url('/products/')
            set pageCount css(".max_results_count").first().text()

        set page 1
        while page < pageCount
            visit url('/products/?page=' ~ page)
                set page page + 1
                expect status_code() == 200
                # performance checks
To run scenarios defined in several files, you can use load instead of listing all the files as arguments to the player:

    # load and execute all scenarios from files in this directory
    load "*.bkf"

    # load and execute all scenarios from files in all sub-directories
    load "**/*.bkf"
Blackfire Player integrates seamlessly with Blackfire Profiler. Read the dedicated documentation to learn more about the Blackfire Profiler integration.
When crawling an HTTP application, you can extract values from HTTP responses:

    scenario
        visit url('/')
            expect status_code() == 200
            set latest_post_title css(".post h2").first()
            set latest_post_href css(".post h2 a").first().attr("href")
            set latest_posts css(".post h2 a").extract('_text', 'href')
            set age header("Age")
            set content_type header("Content-Type")
            set token regex('/name="_token" value="([^"]+)"/')
set takes two arguments: the name of the variable to store the value in, and the expression to evaluate.
Using json(), css(), and xpath() on JSON, HTML, and XML responses is recommended, but for pure text responses or complex values, you can use the generic regex() function.
Note
regex() takes a regex as an argument and always returns the first captured parenthesized subpattern. Note that backslashes must be escaped by doubling them: "/\\.git/".
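The "first captured subpattern" rule can be illustrated with Python's re module. This is an analogy, not the player's implementation: Python patterns carry no surrounding / delimiters, first_capture is a hypothetical name, the HTML snippet is made up, and returning None when nothing matches is a choice of this sketch.

```python
import re


def first_capture(pattern, text):
    """Return the first parenthesized group of the first match, if any."""
    match = re.search(pattern, text)
    if match is None or match.re.groups == 0:
        # No match, or the pattern has no capturing group.
        return None
    return match.group(1)


# Made-up HTML fragment containing a hidden token field.
html = '<input type="hidden" name="_token" value="abc123">'
print(first_capture(r'name="_token" value="([^"]+)"', html))
# prints: abc123
```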
The values are also available at the end of a crawling session:

    # use --json to display a report including variable values
    blackfire-player run scenario.bkf --json

Variable values can also be injected before running another scenario:

    scenario
        name "Scenario name"
        auth api_username ~ ':' ~ api_password
        set profile_uuid 'zzzz'

        visit url('/profiles' ~ profile_uuid)
            expect status_code() == 200
            set sql_queries json('arguments."sql.pdo.queries".keys(@)')
            set store_url json("_links.store.href")

        visit url(store_url)
            method 'POST'
            body '{ "foo": "batman" }'
            expect status_code() == 200