CI/CD Pipeline Integration with TextPipe
Incorporate TextPipe data transformations into your build and deployment pipelines. These examples show how to run TextPipe command-line operations within GitHub Actions and Azure DevOps pipelines using self-hosted Windows runners.
Pipeline Architecture Overview
TextPipe integrates into CI/CD pipelines as a data transformation step. Common use cases include:
- Data migration projects — Convert legacy formats (EBCDIC, fixed-width) to modern formats (CSV, JSON) as part of a deployment
- Configuration management — Transform environment-specific config files during deployment (update server names, connection strings, paths)
- Test data preparation — Generate and transform test datasets before integration tests run
- ETL pipeline stages — Run TextPipe as a transformation step between data extraction and loading
- Document processing — Batch-update document links and references during server migrations
The typical pipeline flow:
1. Check out the source (including filter files and input data)
2. Verify the TextPipe installation on the runner
3. Execute the TextPipe transformation from the command line
4. Validate the output (file exists, is non-empty, and has the expected format)
5. Deploy or archive the transformed data
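The validation step in this flow names three checks: the file exists, it is non-empty, and it has the expected format. The following is a minimal Python sketch of those checks for CSV output. The function name `validate_csv`, the fixed column count, and the UTF-8 encoding are assumptions made for illustration; they are not part of TextPipe.

```python
import csv
from pathlib import Path


def validate_csv(path, expected_columns):
    """Return a list of problems found in a transformed CSV file (empty list = OK)."""
    file = Path(path)
    if not file.is_file():
        return [f"missing file: {file}"]
    if file.stat().st_size == 0:
        return [f"empty file: {file}"]

    problems = []
    # Assumption: the output is UTF-8 text; adjust the encoding to match your filter.
    with file.open(newline="", encoding="utf-8") as handle:
        for line_number, row in enumerate(csv.reader(handle), start=1):
            if len(row) != expected_columns:
                problems.append(
                    f"{file.name}:{line_number}: expected {expected_columns} "
                    f"columns, found {len(row)}"
                )
    return problems
```

A pipeline step could call `validate_csv` for each output file and fail the job when any call returns a non-empty list.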
Runner Prerequisites
Your self-hosted runner must have:
| Requirement | Details |
|---|---|
| Operating System | Windows 10/11 or Windows Server 2016+ |
| TextPipe Pro | Installed and licensed (COM registered via regsvr32 TextPipe.dll) |
| TextPipe CLI | Available at C:\Program Files\DataMystic\TextPipe\TextPipe.exe (default install path) |
| Runner Agent | GitHub Actions self-hosted runner or Azure DevOps self-hosted agent installed as a Windows service |
| Permissions | Runner service account must have read/write access to data directories |
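Two of these prerequisites can be checked mechanically before a job starts: the executable is present, and the data directories are readable and writable by the account running the job. The sketch below is one way to do that in Python. The `preflight` function is hypothetical, and it does not check licensing or COM registration.

```python
import tempfile
from pathlib import Path

# Default install path quoted in the prerequisites table.
DEFAULT_TEXTPIPE = r"C:\Program Files\DataMystic\TextPipe\TextPipe.exe"


def preflight(textpipe_path=DEFAULT_TEXTPIPE, data_dirs=()):
    """Return a list of problems that would stop a TextPipe job on this runner."""
    problems = []
    if not Path(textpipe_path).is_file():
        problems.append(f"TextPipe executable not found: {textpipe_path}")
    for directory in data_dirs:
        if not Path(directory).is_dir():
            problems.append(f"data directory missing: {directory}")
            continue
        try:
            # Writing and reading back a scratch file exercises both permissions.
            with tempfile.TemporaryFile(dir=directory) as probe:
                probe.write(b"probe")
                probe.seek(0)
                probe.read()
        except OSError as error:
            problems.append(f"cannot read/write in {directory}: {error}")
    return problems
```

Run it under the same service account as the runner agent, since permissions are evaluated for whichever account executes the script.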
TextPipe Command-Line Syntax
TextPipe Pro supports command-line execution for CI/CD integration:
REM Transform a single file
TextPipe.exe /filter:"C:\Filters\transform.fll" /input:"C:\Data\input.csv" /output:"C:\Data\output.csv" /silent /overwrite
REM Process all files in a folder
TextPipe.exe /filter:"C:\Filters\transform.fll" /inputfolder:"C:\Data\Input" /outputfolder:"C:\Data\Output" /mask:"*.csv" /recurse /silent /overwrite
REM With logging
TextPipe.exe /filter:"C:\Filters\transform.fll" /input:"C:\Data\input.csv" /output:"C:\Data\output.csv" /silent /overwrite /log:"C:\Logs\transform.log"
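When the command line is assembled by a script rather than typed by hand, building it as an argument list avoids quoting mistakes. The sketch below uses only the switches shown in the examples above, copied as-is from this page and not independently verified against the TextPipe documentation. The function name `build_textpipe_args` and its parameters are hypothetical.

```python
def build_textpipe_args(exe, filter_file, *, input_file=None, output_file=None,
                        input_folder=None, output_folder=None, mask=None,
                        recurse=False, log_file=None):
    """Build an argument list from the switches shown in this section."""
    single = input_file is not None and output_file is not None
    folder = input_folder is not None and output_folder is not None
    if single == folder:
        raise ValueError(
            "give either input_file/output_file or input_folder/output_folder"
        )

    args = [exe, f"/filter:{filter_file}"]
    if single:
        args += [f"/input:{input_file}", f"/output:{output_file}"]
    else:
        args += [f"/inputfolder:{input_folder}", f"/outputfolder:{output_folder}"]
        if mask:
            args.append(f"/mask:{mask}")
        if recurse:
            args.append("/recurse")
    args += ["/silent", "/overwrite"]
    if log_file:
        args.append(f"/log:{log_file}")
    return args
```

The resulting list can be passed to `subprocess.run(args, check=True)`. Python then quotes each whole argument that contains spaces, which differs textually from the `/filter:"..."` form above, so confirm on your runner that TextPipe parses it the same way.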
GitHub Actions Example
A complete GitHub Actions workflow that runs TextPipe data transformations on a self-hosted Windows runner. This example shows a data migration pipeline that converts mainframe EBCDIC files to CSV format.
# .github/workflows/data-transform.yml
# Data transformation pipeline using TextPipe on a self-hosted Windows runner
name: Data Transformation Pipeline
on:
  push:
    paths:
      - 'data/input/**'
      - 'filters/**'
  workflow_dispatch:
    inputs:
      filter_name:
        description: 'Filter file to use'
        required: true
        default: 'ebcdic_to_csv.fll'
jobs:
  transform:
    runs-on: [self-hosted, Windows, textpipe]
    env:
      TEXTPIPE_PATH: 'C:\Program Files\DataMystic\TextPipe\TextPipe.exe'
      FILTER_DIR: '${{ github.workspace }}\filters'
      INPUT_DIR: '${{ github.workspace }}\data\input'
      OUTPUT_DIR: '${{ github.workspace }}\data\output'
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Verify TextPipe installation
        shell: pwsh
        run: |
          if (-not (Test-Path $env:TEXTPIPE_PATH)) {
            Write-Error "TextPipe not found at $env:TEXTPIPE_PATH"
            exit 1
          }
          $version = & $env:TEXTPIPE_PATH /version 2>&1
          Write-Host "TextPipe version: $version"
      - name: Create output directory
        shell: pwsh
        run: |
          New-Item -ItemType Directory -Path $env:OUTPUT_DIR -Force | Out-Null
      - name: Run TextPipe transformation
        shell: pwsh
        run: |
          $filter = Join-Path $env:FILTER_DIR "${{ inputs.filter_name || 'ebcdic_to_csv.fll' }}"
          Write-Host "Filter: $filter"
          Write-Host "Input: $env:INPUT_DIR"
          Write-Host "Output: $env:OUTPUT_DIR"
          & $env:TEXTPIPE_PATH `
            /filter:"$filter" `
            /inputfolder:"$env:INPUT_DIR" `
            /outputfolder:"$env:OUTPUT_DIR" `
            /mask:"*.dat" `
            /recurse `
            /silent `
            /overwrite `
            /log:"$env:OUTPUT_DIR\transform.log"
          if ($LASTEXITCODE -ne 0) {
            # Show the log tail first: the runner sets $ErrorActionPreference = 'stop',
            # so Write-Error ends the step and nothing after it would run.
            if (Test-Path "$env:OUTPUT_DIR\transform.log") {
              Get-Content "$env:OUTPUT_DIR\transform.log" -Tail 20
            }
            Write-Error "TextPipe exited with code $LASTEXITCODE"
            exit 1
          }
      - name: Validate output
        shell: pwsh
        run: |
          $outputFiles = Get-ChildItem $env:OUTPUT_DIR -Filter "*.csv" -Recurse
          if ($outputFiles.Count -eq 0) {
            Write-Error "No output files generated"
            exit 1
          }
          Write-Host "Output files generated: $($outputFiles.Count)"
          foreach ($file in $outputFiles) {
            if ($file.Length -eq 0) {
              Write-Error "Empty output file: $($file.Name)"
              exit 1
            }
            Write-Host " $($file.Name): $($file.Length) bytes"
          }
      - name: Upload transformed data
        uses: actions/upload-artifact@v4
        with:
          name: transformed-data
          path: data/output/
          retention-days: 30
      - name: Upload transformation log
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: transform-log
          path: data/output/transform.log
          retention-days: 7
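The failure branch of the transformation step is meant to surface the end of the log when the tool exits with a non-zero code. The same pattern can be sketched in Python for runners that drive the tool from a script. `run_and_report` and `tail` are hypothetical helper names, and the command passed in is whatever your pipeline executes.

```python
import subprocess
import sys
from collections import deque
from pathlib import Path


def tail(path, count=20):
    """Return the last `count` lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(deque(handle, maxlen=count))


def run_and_report(command, log_path, tail_lines=20):
    """Run `command`; on a non-zero exit, echo the end of the log to stderr."""
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"command exited with code {result.returncode}", file=sys.stderr)
        if Path(log_path).is_file():
            sys.stderr.writelines(tail(log_path, tail_lines))
    return result.returncode
```

Returning the exit code instead of raising lets the caller decide whether to fail the job immediately or collect further diagnostics first.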
Azure DevOps Pipeline Example
An Azure DevOps YAML pipeline for running TextPipe transformations during a deployment. This example shows a two-stage pipeline: a Transform stage that runs the filters, validates the output, and publishes artifacts, followed by a Deploy stage.
# azure-pipelines.yml
# Data transformation pipeline using TextPipe on a self-hosted Windows agent
trigger:
  paths:
    include:
      - data/input/*
      - filters/*
pool:
  name: 'DataProcessing' # Self-hosted agent pool with TextPipe installed
  demands:
    - TextPipe
variables:
  textpipePath: 'C:\Program Files\DataMystic\TextPipe\TextPipe.exe'
  filterDir: '$(Build.SourcesDirectory)\filters'
  inputDir: '$(Build.SourcesDirectory)\data\input'
  outputDir: '$(Build.ArtifactStagingDirectory)\transformed'
  logDir: '$(Build.ArtifactStagingDirectory)\logs'
stages:
  - stage: Transform
    displayName: 'Data Transformation'
    jobs:
      - job: RunTextPipe
        displayName: 'Execute TextPipe Filters'
        steps:
          - checkout: self
          - task: PowerShell@2
            displayName: 'Verify TextPipe Installation'
            inputs:
              targetType: inline
              script: |
                if (-not (Test-Path "$(textpipePath)")) {
                  Write-Error "TextPipe not installed on this agent"
                  exit 1
                }
                Write-Host "TextPipe found at $(textpipePath)"
          - task: PowerShell@2
            displayName: 'Prepare Directories'
            inputs:
              targetType: inline
              script: |
                New-Item -ItemType Directory -Path "$(outputDir)" -Force
                New-Item -ItemType Directory -Path "$(logDir)" -Force
          - task: PowerShell@2
            displayName: 'Run Data Transformation'
            inputs:
              targetType: inline
              script: |
                $logFile = "$(logDir)\transform_$(Get-Date -Format 'yyyyMMdd_HHmmss').log"
                & "$(textpipePath)" `
                  /filter:"$(filterDir)\server_migration.fll" `
                  /inputfolder:"$(inputDir)" `
                  /outputfolder:"$(outputDir)" `
                  /mask:"*.config" `
                  /recurse `
                  /silent `
                  /overwrite `
                  /log:"$logFile"
                if ($LASTEXITCODE -ne 0) {
                  # Show the log tail first: the task's default error action is 'stop',
                  # so Write-Error ends the script and nothing after it would run.
                  if (Test-Path $logFile) {
                    Get-Content $logFile | Select-Object -Last 30
                  }
                  Write-Error "TextPipe failed with exit code $LASTEXITCODE"
                  exit 1
                }
                Write-Host "Transformation complete. Log: $logFile"
          - task: PowerShell@2
            displayName: 'Validate Output'
            inputs:
              targetType: inline
              script: |
                $files = Get-ChildItem "$(outputDir)" -Recurse -File
                if ($files.Count -eq 0) {
                  Write-Error "No output files produced"
                  exit 1
                }
                $emptyFiles = $files | Where-Object { $_.Length -eq 0 }
                if ($emptyFiles) {
                  Write-Error "Empty files detected: $($emptyFiles.Name -join ', ')"
                  exit 1
                }
                Write-Host "Validated $($files.Count) output files"
          - task: PublishBuildArtifacts@1
            displayName: 'Publish Transformed Data'
            inputs:
              PathtoPublish: '$(outputDir)'
              ArtifactName: 'TransformedData'
          - task: PublishBuildArtifacts@1
            displayName: 'Publish Logs'
            condition: always()
            inputs:
              PathtoPublish: '$(logDir)'
              ArtifactName: 'TransformLogs'
  - stage: Deploy
    displayName: 'Deploy Transformed Data'
    dependsOn: Transform
    condition: succeeded()
    jobs:
      - deployment: DeployData
        displayName: 'Deploy to Target'
        environment: 'production-data'
        strategy:
          runOnce:
            deploy:
              steps:
                - task: PowerShell@2
                  displayName: 'Copy to destination'
                  inputs:
                    targetType: inline
                    script: |
                      $source = "$(Pipeline.Workspace)\TransformedData"
                      $dest = "\\fileserver\data\production"
                      Copy-Item -Path "$source\*" -Destination $dest -Recurse -Force
                      Write-Host "Deployed to $dest"
Validation and Rollback Patterns
Add validation steps to your pipeline to catch transformation errors before deployment. Keep the original data for rollback capability.
# validation-step.ps1
# Pipeline validation script for TextPipe output
param(
    [Parameter(Mandatory)]
    [string]$OutputDir,
    [Parameter(Mandatory)]
    [string]$InputDir,
    [int]$MinSizeRatio = 10 # Output must be at least 10% of input size
)
$errors = @()
$warnings = @()

# Check 1: Output files exist
$outputFiles = Get-ChildItem $OutputDir -File -Recurse
if ($outputFiles.Count -eq 0) {
    $errors += "No output files found in $OutputDir"
}

# Check 2: No empty files
$emptyFiles = $outputFiles | Where-Object { $_.Length -eq 0 }
if ($emptyFiles) {
    $errors += "Empty output files: $($emptyFiles.Name -join ', ')"
}

# Check 3: Size comparison
$inputFiles = Get-ChildItem $InputDir -File -Recurse
$totalInput = ($inputFiles | Measure-Object -Property Length -Sum).Sum
$totalOutput = ($outputFiles | Measure-Object -Property Length -Sum).Sum
if ($totalInput -gt 0) {
    $ratio = ($totalOutput / $totalInput) * 100
    if ($ratio -lt $MinSizeRatio) {
        $errors += "Output is only ${ratio}% of input size (minimum: ${MinSizeRatio}%)"
    }
}

# Check 4: Verify expected file count matches
if ($outputFiles.Count -lt $inputFiles.Count) {
    $warnings += "Fewer output files ($($outputFiles.Count)) than input files ($($inputFiles.Count))"
}

# Report results
if ($errors.Count -gt 0) {
    Write-Host "VALIDATION FAILED:" -ForegroundColor Red
    $errors | ForEach-Object { Write-Error $_ }
    exit 1
}
if ($warnings.Count -gt 0) {
    Write-Host "VALIDATION PASSED WITH WARNINGS:" -ForegroundColor Yellow
    $warnings | ForEach-Object { Write-Warning $_ }
} else {
    Write-Host "VALIDATION PASSED" -ForegroundColor Green
}
Write-Host " Files: $($outputFiles.Count)"
Write-Host " Total size: $([math]::Round($totalOutput / 1MB, 2)) MB"
exit 0
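The script above covers validation. The rollback half of this section, keeping the original data so a bad deployment can be undone, could look like the following Python sketch. It is an illustration under stated assumptions and not a TextPipe feature: `deploy_with_rollback` is a hypothetical name, the destination is assumed to be a directory the job may delete and recreate, and `verify` is any callable you supply.

```python
import shutil
from pathlib import Path


def deploy_with_rollback(source, dest, backup, verify=None):
    """Copy `source` into `dest`, restoring the previous contents on failure."""
    source, dest, backup = Path(source), Path(dest), Path(backup)
    if backup.exists():
        shutil.rmtree(backup)          # discard the snapshot from an earlier run
    had_previous = dest.exists()
    if had_previous:
        shutil.copytree(dest, backup)  # snapshot the data about to be replaced
    try:
        shutil.copytree(source, dest, dirs_exist_ok=True)  # requires Python 3.8+
        if verify is not None and not verify(dest):
            raise RuntimeError(f"verification failed for {dest}")
    except Exception:
        # Roll back: remove the partial deployment, then restore the snapshot.
        shutil.rmtree(dest, ignore_errors=True)
        if had_previous:
            shutil.copytree(backup, dest, dirs_exist_ok=True)
        raise
```

The exception is re-raised after the restore so the pipeline step still fails and the problem stays visible. For large datasets a rename-based swap may be preferable to a full copy; the copy is used here only because it is the simplest form to reason about.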
Data Migration Pipeline Pattern
A complete pipeline pattern for data migration projects that use TextPipe to transform legacy data formats. This combines extraction, transformation with TextPipe, validation, and loading into a target system.
# .github/workflows/data-migration.yml
# Complete data migration pipeline: Extract -> Transform (TextPipe) -> Validate -> Load
name: Data Migration Pipeline
on:
  schedule:
    - cron: '0 2 * * 1-5' # Monday to Friday at 02:00 UTC
  workflow_dispatch:
jobs:
  extract:
    runs-on: [self-hosted, Windows, textpipe]
    outputs:
      file_count: ${{ steps.extract.outputs.count }}
    steps:
      - uses: actions/checkout@v4
      - name: Extract source data
        id: extract
        shell: pwsh
        run: |
          # Copy from source (network share, FTP, etc.)
          $sourceDir = "\\legacy-server\exports\daily"
          $extractDir = "${{ github.workspace }}\data\extracted"
          New-Item -ItemType Directory -Path $extractDir -Force
          Copy-Item "$sourceDir\*.dat" $extractDir -Recurse
          $count = (Get-ChildItem $extractDir -File).Count
          echo "count=$count" >> $env:GITHUB_OUTPUT
          Write-Host "Extracted $count files"
      - uses: actions/upload-artifact@v4
        with:
          name: extracted-data
          path: data/extracted/
  transform:
    needs: extract
    runs-on: [self-hosted, Windows, textpipe]
    if: needs.extract.outputs.file_count > 0
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: extracted-data
          path: data/extracted/
      - name: Transform with TextPipe
        shell: pwsh
        run: |
          $textpipe = "C:\Program Files\DataMystic\TextPipe\TextPipe.exe"
          $outputDir = "${{ github.workspace }}\data\transformed"
          New-Item -ItemType Directory -Path $outputDir -Force
          # Step 1: EBCDIC to ASCII conversion
          & $textpipe `
            /filter:"filters\ebcdic_to_ascii.fll" `
            /inputfolder:"data\extracted" `
            /outputfolder:"$outputDir\step1" `
            /mask:"*.dat" /silent /overwrite
          # Stop here on failure so step 2 never reads incomplete output
          if ($LASTEXITCODE -ne 0) {
            Write-Error "EBCDIC conversion failed with exit code $LASTEXITCODE"
            exit 1
          }
          # Step 2: Fixed-width to CSV
          & $textpipe `
            /filter:"filters\fixed_to_csv.fll" `
            /inputfolder:"$outputDir\step1" `
            /outputfolder:"$outputDir\final" `
            /mask:"*.txt" /silent /overwrite
          if ($LASTEXITCODE -ne 0) {
            Write-Error "Fixed-width conversion failed with exit code $LASTEXITCODE"
            exit 1
          }
          Write-Host "Transformation pipeline complete"
      - name: Validate transformed data
        shell: pwsh
        run: |
          .\scripts\validation-step.ps1 `
            -OutputDir "data\transformed\final" `
            -InputDir "data\extracted"
      - uses: actions/upload-artifact@v4
        with:
          name: transformed-data
          path: data/transformed/final/
  load:
    needs: transform
    runs-on: [self-hosted, Windows]
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: transformed-data
          path: data/ready/
      - name: Load into target database
        shell: pwsh
        run: |
          # Import CSV files into target system
          Get-ChildItem "data\ready" -Filter "*.csv" | ForEach-Object {
            Write-Host "Loading: $($_.Name)"
            # Use your preferred data loading tool here
            # e.g., bcp, BULK INSERT, or application-specific import
          }
          Write-Host "Data load complete"
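The transform job chains two conversions, and the second reads what the first wrote, so a failure in the first should stop the chain. A minimal Python sketch of that fail-fast sequencing follows. `run_stages` is a hypothetical helper, and each command is whatever argument list your pipeline would run for that stage.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, command) pairs in order; stop at the first non-zero exit.

    Returns the name of the failed stage, or None when every stage succeeded.
    """
    for name, command in stages:
        result = subprocess.run(command)
        if result.returncode != 0:
            print(f"stage '{name}' failed with exit code {result.returncode}",
                  file=sys.stderr)
            return name
    return None
```

A caller would pass something like `[("ebcdic_to_ascii", step1_args), ("fixed_to_csv", step2_args)]` and exit with a non-zero status whenever the return value is not `None`.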
Next Steps
- PowerShell Integration — Detailed PowerShell COM automation scripts
- Python Integration — Python pywin32 COM automation examples
- Task Scheduler — Schedule TextPipe with Windows Task Scheduler
- Batch Processing — Command-line batch automation patterns