Automated Metrics Documentation Generator¶
Overview¶
This documentation describes the automated scripting process used to extract and document InfluxDB metrics from the Brekz PrestaShop codebase. The script was instrumental in building the comprehensive metrics tables found in the add-checkout-metrics branch.
What This Script Does¶
The metrics documentation generator performs the following tasks:
- Searches the codebase for all InfluxDb::writeMeasurement() calls using ripgrep with multiline pattern matching
- Parses measurement data including measurement names, fields, and tags from the PHP code
- Generates GitHub permalinks to the exact line of code where each metric is tracked
- Formats output as Markdown tables ready to be pasted directly into the documentation
This automated approach ensures that:
- All metrics are consistently documented
- GitHub links point to the correct source code locations
- The documentation format matches the project standards
- No metrics are accidentally overlooked during manual documentation
Prerequisites¶
Before running the script, ensure you have the following installed:
- Ripgrep (rg) - A fast recursive search tool (installation guide)
- PHP CLI - Command-line PHP interpreter (version 7.4 or higher)
- Clipboard utility:
  - macOS: pbcopy (built-in)
  - Linux: xclip or xsel
  - Windows: clip (built-in) or WSL with xclip
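As a quick way to confirm the list above before starting, the following sketch checks whether rg and php are on the PATH. The function names check_tool and check_prerequisites are made up for this example; only the tool names come from the prerequisites.

```shell
#!/bin/sh
# Report whether a single tool is on PATH.
# check_tool is a hypothetical helper name, not part of the generator.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf 'ok       %s\n' "$1"
    else
        printf 'missing  %s\n' "$1"
        return 1
    fi
}

# Check the two tools the generator needs (names taken from the list above).
check_prerequisites() {
    rc=0
    for tool in rg php; do
        check_tool "$tool" || rc=1
    done
    return "$rc"
}

check_prerequisites || echo "Install the missing tools before running the generator."
```

The clipboard utility is left out on purpose because the right one depends on the platform.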
How to Use¶
Step 1: Navigate to Your Codebase¶
cd /path/to/brekz-prestashop
Step 2: Create the PHP Script¶
Save the PHP script (shown in the PHP Script Source section below) as getInfluxMeasurementRowArray in the directory you will run the command from, since the commands in Step 3 invoke it as ./getInfluxMeasurementRowArray. Make it executable:
chmod +x getInfluxMeasurementRowArray
Step 3: Run the Command¶
Execute the ripgrep command piped into the PHP script:
macOS:
rg -n --color=always -o 'InfluxDb::writeMeasurement(.|\n)*?\)\s*;' --multiline --with-filename | ./getInfluxMeasurementRowArray | pbcopy
Linux:
rg -n --color=always -o 'InfluxDb::writeMeasurement(.|\n)*?\)\s*;' --multiline --with-filename | ./getInfluxMeasurementRowArray | xclip -selection clipboard
Windows (WSL):
rg -n --color=always -o 'InfluxDb::writeMeasurement(.|\n)*?\)\s*;' --multiline --with-filename | ./getInfluxMeasurementRowArray | clip.exe
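The three variants differ only in the final clipboard command. If one command line should work on several machines, a small helper can pick whichever clipboard tool is installed. This is a sketch under assumptions: pick_clipboard_cmd is a hypothetical name, and falling back to cat (printing the rows to the terminal) is a choice made for this example.

```shell
#!/bin/sh
# Print the first clipboard command found on PATH. Falls back to `cat`
# so the rows are simply printed when no clipboard tool is available.
pick_clipboard_cmd() {
    if command -v pbcopy >/dev/null 2>&1; then
        echo "pbcopy"
    elif command -v xclip >/dev/null 2>&1; then
        echo "xclip -selection clipboard"
    elif command -v xsel >/dev/null 2>&1; then
        echo "xsel --clipboard --input"
    elif command -v clip.exe >/dev/null 2>&1; then
        echo "clip.exe"
    else
        echo "cat"
    fi
}

echo "Clipboard command: $(pick_clipboard_cmd)"
```

With the helper defined, the last pipeline stage can be written as an unquoted $(pick_clipboard_cmd) in place of pbcopy, xclip, or clip.exe.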
Step 4: Paste Into Documentation¶
The generated Markdown table rows are now in your clipboard. Paste them into your documentation file under the appropriate table header.
Command Breakdown¶
Let's break down the ripgrep command to understand what each part does:
rg -n --color=always -o 'InfluxDb::writeMeasurement(.|\n)*?\)\s*;' --multiline --with-filename
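- rg - Invokes ripgrep
- -n - Show line numbers
- --color=always - Force color codes in the output even when it is piped (helpful for debugging; the PHP script strips them again)
- -o - Print only the matched text instead of the whole line
- 'InfluxDb::writeMeasurement(.|\n)*?\)\s*;' - Regex pattern that lazily matches from InfluxDb::writeMeasurement up to the first closing parenthesis followed by a semicolon, so a call spanning multiple lines is captured as one match
- --multiline - Enable matching across line breaks
- --with-filename - Print the file path with each match (the line number comes from -n)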
The output is then piped (|) to the PHP script which parses and formats it.
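To make the hand-off concrete, the sketch below fakes three lines in the path:line:text shape that the PHP script's prefix checks expect, then strips ANSI color sequences with sed, mirroring the script's first cleanup step. The sample path, line numbers, and call are invented, and the exact escape sequences ripgrep emits may differ from the one shown.

```shell
#!/bin/sh
# Strip ANSI color sequences (ESC [ ... letter) from stdin.
ESC=$(printf '\033')
strip_ansi() {
    sed "s/${ESC}\[[0-9;]*[a-zA-Z]//g"
}

# Invented sample in "path:line:text" form; the first path is colored.
sample() {
    printf '%s[35mweb/Example.php%s[0m:10:InfluxDb::writeMeasurement(\n' "$ESC" "$ESC"
    printf "web/Example.php:11:    'cart', ['add' => 1], ['shop_id' => \$id]\n"
    printf 'web/Example.php:12:);\n'
}

sample | strip_ansi
```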
Customizing for Your Project¶
Changing the GitHub Permalink¶
If you need to update the GitHub repository URL or commit hash, modify the getPermaLink() function in the PHP script:
function getPermaLink(string $fileLine): string
{
    // Update this URL to match your repository and commit
    $githubUrlPart = "https://github.com/your-org/your-repo/blob/COMMIT_HASH/";
    $fileLines = explode(":", $fileLine, 2);
    return '[Github Link](' . $githubUrlPart . $fileLines[0] . "#L" . ($fileLines[1] ?? '1') . ')';
}
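To check what a changed URL will produce without running the whole pipeline, the same split can be mirrored in shell. This is only an illustration of the logic in getPermaLink(): the path is everything before the first colon, and the line number defaults to 1 when none is present. BASE_URL reuses the placeholder from the snippet above, and the permalink function name is an assumption.

```shell
#!/bin/sh
# Shell mirror of the split done by getPermaLink().
# BASE_URL is the placeholder from the snippet above, not a real repository.
BASE_URL="https://github.com/your-org/your-repo/blob/COMMIT_HASH/"

permalink() {
    file_line=$1
    # Everything before the first ":" is the path.
    path=${file_line%%:*}
    # Everything after the first ":" is the line number; default to 1.
    case $file_line in
        *:*) line=${file_line#*:} ;;
        *)   line=1 ;;
    esac
    printf '[Github Link](%s%s#L%s)\n' "$BASE_URL" "$path" "$line"
}

permalink "web/override/controllers/front/CartController.php:76"
```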
Modifying the Table Format¶
To change the output table structure, edit the formatAsTableRows() function:
function formatAsTableRows(array $data): array
{
    $result = [];
    foreach ($data as $item) {
        $result[] = sprintf(
            '| <a id="MT-0" href="#MT-0"> MT-0 </a> | %s | Event | InfluxDB | %s | %s |',
            $item['title'],
            $item['permalink'],
            $item['filters/tags']
        );
    }
    return $result;
}
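Every generated row carries the same MT-0 anchor. If the target table needs unique IDs, one option is to renumber the rows after generation. This is an assumption about the workflow rather than something the script does; renumber_rows is a hypothetical helper and the sample rows are shortened for illustration.

```shell
#!/bin/sh
# Replace the placeholder MT-0 in each row with MT-<n>, counting up from
# the first argument (default 1). Rows are read from stdin.
renumber_rows() {
    awk -v n="${1:-1}" '{ gsub(/MT-0/, "MT-" n); n++; print }'
}

printf '%s\n' \
    '| <a id="MT-0" href="#MT-0"> MT-0 </a> | first | Event | InfluxDB |' \
    '| <a id="MT-0" href="#MT-0"> MT-0 </a> | second | Event | InfluxDB |' |
    renumber_rows 5
```

Note that the substitution touches every MT-0 in a row, so a title or tag that happens to contain that text would be rewritten as well.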
Searching Different Patterns¶
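To search for different metric patterns, adjust the regex in the ripgrep command. For example, to find Google Analytics events:
rg -n --color=always -o 'ga\(.*?\);' --multiline --with-filename
In ripgrep, . does not match a newline unless --multiline-dotall is also passed, so this pattern only finds ga() calls that fit on one line; use (.|\n)*? as in the InfluxDB pattern to span lines. The PHP script only recognizes blocks that begin with InfluxDb::writeMeasurement, so the prefix checks in processRipgrepOutput() must be adapted to the new pattern as well.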
Troubleshooting¶
No Output Generated¶
- Verify that ripgrep is installed: rg --version
- Check that you're in the correct directory with PHP files
- Test the ripgrep pattern alone first: rg 'InfluxDb::writeMeasurement' --multiline
PHP Errors¶
- Ensure PHP CLI is available: php --version
- Check script permissions: ls -l getInfluxMeasurementRowArray
- Run the script directly with test input: echo "test" | ./getInfluxMeasurementRowArray
Incomplete Matches¶
- Some measurements may be too complex for the regex pattern
- Check the error output (stderr) for parsing warnings
- Manually verify complex measurement calls
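The script reports parse problems through errorPrint(), which writes lines starting with ERROR: to stderr, so they never reach the clipboard but can scroll past unnoticed. One way to review them is to redirect stderr to a file. The sketch below uses a stand-in function instead of the real pipeline so that it runs without PHP; fake_generator and its sample row are invented for illustration.

```shell
#!/bin/sh
# Stand-in for the real pipeline: one table row on stdout and one warning
# on stderr, in the "ERROR: ..." form that errorPrint() produces.
fake_generator() {
    echo '| MT-0 | Example: cart - add |'
    echo 'ERROR: Invalid measurement format: measurement name not found' >&2
}

# Keep the rows and the warnings apart.
log_file=$(mktemp)
rows=$(fake_generator 2>"$log_file")

echo "rows:     $rows"
echo "warnings: $(grep -c '^ERROR:' "$log_file")"
rm -f "$log_file"
```

With the real command, the same idea is to add 2>parse-errors.log after ./getInfluxMeasurementRowArray and before the pipe into the clipboard tool.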
Example Output¶
The script generates Markdown table rows like this:
| <a id="MT-0" href="#MT-0"> MT-0 </a> | CartController: cart - add | Event | InfluxDB | [Github Link](https://github.com/brekz-group/brekz-prestashop/blob/3c9b697e7722eb52b8ed4c1981cc8091e7824013/web/override/controllers/front/CartController.php#L76) | - shop_id: (string)$this->context->shop->id <br> - shop_name: $this->context->shop->name <br> |
PHP Script Source¶
Below is the complete PHP script used to parse ripgrep output and generate documentation:
#!/usr/bin/env php
<?php
function errorPrint($message)
{
    file_put_contents('php://stderr', "ERROR: " . $message . "\n");
}
function getPermaLink(string $fileLine): string
{
    // Handle cases where fileLine might contain line numbers or fragments
    if (strpos($fileLine, '#L') !== false) {
        $fileLine = substr($fileLine, 0, strpos($fileLine, '#L'));
    }
    $githubUrlPart = "https://github.com/brekz-group/brekz-prestashop/blob/3c9b697e7722eb52b8ed4c1981cc8091e7824013/";
    $fileLines = explode(":", $fileLine, 2);
    return '[Github Link](' . $githubUrlPart . $fileLines[0] . "#L" . ($fileLines[1] ?? '1') . ')';
}
function getTitle(string $fileLine, array $influxMeasurementArray): string
{
    // Extract filename from file path
    $matches = [];
    if (preg_match('/([^\/]+)\.php/', $fileLine, $matches)) {
        $filename = $matches[1];
    } else {
        // Fallback for unknown files
        $filename = 'unknown';
    }
    // Clean up filename if it contains #L
    $filename = preg_replace('/#L\d+$/', '', $filename);
    // Get the measurement name
    $measurement = $influxMeasurementArray['measurement'];
    // Append every field name (not its value) to the title
    $fieldString = '';
    $fields = $influxMeasurementArray['fields'];
    foreach ($fields as $fieldName => $fieldValue) {
        $fieldString .= ' - ' . $fieldName;
    }
    // Append the "reason" tag when the measurement has one
    $reasonString = '';
    if (array_key_exists('reason', $influxMeasurementArray['tags'])) {
        $reasonString = ' - reason: ' . $influxMeasurementArray['tags']['reason'];
    }
    // Title format: "<file>: <measurement> - <field>... [- reason: <value>]"
    return $filename . ': ' . $measurement . $fieldString . $reasonString;
}
function getFiltersAndTags(array $influxMeasurementArray): string
{
    $string = '';
    foreach ($influxMeasurementArray['tags'] as $fieldName => $fieldValue) {
        // Compare against '' instead of using empty(), which would also drop a literal 0
        if ((string) $fieldValue !== '') {
            $string .= ' - ' . $fieldName . ': ' . $fieldValue . ' <br> ';
        }
    }
    // If no tags were found but we have fields, use the first field as a tag
    if (empty($string) && ! empty($influxMeasurementArray['fields'])) {
        $firstField = array_slice($influxMeasurementArray['fields'], 0, 1);
        $string = ' - ' . key($firstField) . ': ' . current($firstField) . ' <br> ';
    }
    return $string;
}
function parseArrayString(string $str): array
{
    // First, extract the measurement name
    if (! preg_match("/'([^']+)'\s*,/s", $str, $measurementMatch)) {
        errorPrint("Invalid measurement format: measurement name not found");
        errorPrint("String being parsed: " . substr($str, 0, 500) . (strlen($str) > 500 ? '...' : ''));
        throw new InvalidArgumentException("Invalid measurement format: measurement name not found");
    }
    $measurement = $measurementMatch[1];
    // Initialize arrays
    $fields = [];
    $tags = [];
    // Try to extract arrays from the string
    $arrayStartPos = strpos($str, '[');
    if ($arrayStartPos === false) {
        errorPrint("Invalid array string format: no arrays found");
        errorPrint("String being parsed: " . substr($str, 0, 500) . (strlen($str) > 500 ? '...' : ''));
        return [
            'measurement' => $measurement,
            'fields' => $fields,
            'tags' => $tags,
        ];
    }
    // Try to find the first array (fields).
    // Note: the first "]" ends the array, so nested arrays or values such as $a['b'] are cut short.
    $arrayEndPos = strpos($str, ']', $arrayStartPos);
    if ($arrayEndPos !== false) {
        $fieldsStr = substr($str, $arrayStartPos + 1, $arrayEndPos - $arrayStartPos - 1);
        preg_match_all("/\s*'([^']+)'\s*=>\s*([^,]+)/", $fieldsStr, $matches, PREG_SET_ORDER);
        foreach ($matches as $match) {
            $key = $match[1];
            $value = trim($match[2]);
            // Compare against '' instead of using empty(), which would also drop a literal 0
            if ($value !== '') {
                $fields[$key] = $value;
            }
        }
        // Try to find a second array (tags)
        $secondArrayStartPos = strpos($str, '[', $arrayEndPos);
        if ($secondArrayStartPos !== false) {
            $secondArrayEndPos = strpos($str, ']', $secondArrayStartPos);
            if ($secondArrayEndPos !== false) {
                $tagsStr = substr($str, $secondArrayStartPos + 1, $secondArrayEndPos - $secondArrayStartPos - 1);
                preg_match_all("/\s*'([^']+)'\s*=>\s*([^,]+)/", $tagsStr, $matches, PREG_SET_ORDER);
                foreach ($matches as $match) {
                    $key = $match[1];
                    $value = trim($match[2]);
                    if ($value !== '') {
                        $tags[$key] = $value;
                    }
                }
            }
        }
    }
    return [
        'measurement' => $measurement,
        'fields' => $fields,
        'tags' => $tags,
    ];
}
function processMeasurementBlock(string $fileLine, string $measurementString): array
{
    // Clean measurement string by removing anything after | or line breaks
    $cleanString = preg_replace('/\s*\|\s*.*$/', '', $measurementString);
    $cleanString = preg_replace('/\.\.\.$/', '', $cleanString);
    // First, extract the measurement name
    if (! preg_match("/'([^']+)'\s*,/s", $cleanString, $measurementMatch)) {
        errorPrint("Invalid measurement format: measurement name not found");
        errorPrint("Measurement content: " . substr($cleanString, 0, 500) . (strlen($cleanString) > 500 ? '...' : ''));
        throw new InvalidArgumentException("Invalid measurement format: measurement name not found");
    }
    $measurement = $measurementMatch[1];
    // Try to parse the arrays
    try {
        $measurementArray = parseArrayString($cleanString);
        // If we have no fields but have the measurement name, create a default field
        if (empty($measurementArray['fields']) && strpos($measurement, '_') !== false) {
            $action = substr($measurement, strrpos($measurement, '_') + 1);
            $measurementArray['fields'] = [$action => '1'];
        }
        return [
            'permalink' => getPermaLink($fileLine),
            'title' => getTitle($fileLine, $measurementArray),
            'filters/tags' => getFiltersAndTags($measurementArray),
        ];
    } catch (Exception $e) {
        errorPrint("Error processing measurement at " . $fileLine . ": " . $e->getMessage());
        errorPrint("Measurement content: " . substr($cleanString, 0, 500) . (strlen($cleanString) > 500 ? '...' : ''));
        throw $e;
    }
}
function formatAsTableRows(array $data): array
{
    $result = [];
    foreach ($data as $item) {
        $result[] = sprintf(
            '| <a id="MT-0" href="#MT-0"> MT-0 </a> | %s | Event | InfluxDB | %s | %s |',
            $item['title'],
            $item['permalink'],
            $item['filters/tags']
        );
    }
    return $result;
}
function processRipgrepOutput(string $input): array
{
    // Remove ANSI color codes and debug messages
    $input = preg_replace('/\x1B\[[0-9;]*[a-zA-Z]/', '', $input);
    $input = preg_replace('/DEBUG:.*$/', '', $input);
    $input = preg_replace('/\.\.\.$/', '', $input);
    // Split by file:line prefixes to handle multiline matches
    $lines = explode("\n", trim($input));
    $result = [];
    $currentFileLine = '';
    $currentMeasurement = '';
    $validMeasurements = 0;
    foreach ($lines as $line) {
        $line = rtrim($line);
        if (empty($line)) {
            continue;
        }
        // Skip debug messages and incomplete lines
        if (strpos($line, 'DEBUG:') !== false || strpos($line, '...') !== false) {
            continue;
        }
        // Check for new file:line prefix
        if (preg_match('/^(.*:\d+):InfluxDb::writeMeasurement/', $line)) {
            // If we have a current measurement, process it
            if ($currentMeasurement !== '') {
                try {
                    $result[] = processMeasurementBlock($currentFileLine, $currentMeasurement);
                    $validMeasurements++;
                } catch (Exception $e) {
                    // Skip invalid measurements but continue processing
                }
            }
            // Start new measurement
            $parts = explode(':', $line, 3);
            if (count($parts) >= 3) {
                $currentFileLine = $parts[0] . ':' . $parts[1];
                $currentMeasurement = trim($parts[2]);
            } else {
                $currentFileLine = 'unknown:0';
                $currentMeasurement = $line;
            }
        } else {
            // If we're in a measurement, keep adding lines
            // Remove any leading file:line prefix if it exists
            if (preg_match('/^.*:\d+:/', $line)) {
                $line = preg_replace('/^.*:\d+:/', '', $line);
            }
            // If we're starting a new measurement with a different file, process the current one
            if (preg_match('/^InfluxDb::writeMeasurement/', $line) && $currentMeasurement !== '') {
                try {
                    $result[] = processMeasurementBlock($currentFileLine, $currentMeasurement);
                    $validMeasurements++;
                } catch (Exception $e) {
                    // Skip invalid measurements but continue processing
                }
                $currentMeasurement = $line;
            } else {
                $currentMeasurement .= "\n" . $line;
            }
        }
    }
    // Process the last measurement if it exists
    if ($currentMeasurement !== '') {
        try {
            $result[] = processMeasurementBlock($currentFileLine, $currentMeasurement);
            $validMeasurements++;
        } catch (Exception $e) {
            // Skip invalid measurements
        }
    }
    return formatAsTableRows($result);
}
// Get input from command line arguments or standard input
$input = '';
if ($argc > 1) {
    $input = $argv[1];
} else {
    $input = file_get_contents('php://stdin');
}
// Process the input
try {
    $processedData = processRipgrepOutput($input);
    foreach ($processedData as $row) {
        echo $row . PHP_EOL;
    }
} catch (Exception $e) {
    errorPrint("Fatal error: " . $e->getMessage());
    exit(1);
}
How the Script Works¶
Parsing Process¶
The script follows this parsing logic:
- Input Processing: Receives ripgrep output via stdin or command-line argument
- ANSI Cleanup: Removes color codes and debug messages
- Line-by-Line Parsing:
  - Identifies new measurement blocks by detecting file:line:InfluxDb::writeMeasurement patterns
  - Accumulates multiline measurement calls
  - Handles edge cases like truncated output
- Data Extraction:
  - Parses measurement names from the first parameter
  - Extracts fields array (metrics to track)
  - Extracts tags array (metadata/dimensions)
- Formatting:
  - Generates GitHub permalinks using file paths and line numbers
  - Creates descriptive titles from filename and measurement name
  - Formats output as Markdown table rows
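The data extraction step can be tried in isolation. The sketch below is a rough shell approximation of how the measurement name is found: it joins a multiline call onto one line and prints the first single-quoted string when a comma follows it. It is not the script's own code (the PHP regex accepts any quoted string followed by a comma, not only the first), and the sample call and the measurement_name helper are invented for illustration.

```shell
#!/bin/sh
# Join the call onto one line, then print the first single-quoted string
# if it is followed by a comma (the measurement name).
measurement_name() {
    { tr '\n' ' '; echo; } |
        sed -n "s/^[^']*'\([^']*\)'[[:space:]]*,.*/\1/p"
}

# Invented multiline call for illustration.
printf '%s\n' \
    "InfluxDb::writeMeasurement(" \
    "    'cart'," \
    "    ['add' => 1]," \
    "    ['shop_id' => \$id]" \
    ");" | measurement_name
```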
Key Functions¶
- getPermaLink() - Constructs GitHub URLs pointing to exact source code lines
- getTitle() - Generates human-readable metric titles
- getFiltersAndTags() - Extracts and formats metric dimensions
- parseArrayString() - Parses PHP array syntax from measurement calls
- processMeasurementBlock() - Orchestrates parsing of individual metrics
- processRipgrepOutput() - Main processing loop handling multiline matches
Related Documentation¶
This script was used to generate the comprehensive metrics tables in:
- Branch: add-checkout-metrics
- File: docs/features/checkout-process/current-situation.md
The generated documentation includes over 100 InfluxDB metrics tracking various aspects of the checkout process, user authentication, contact forms, and payment flows.