apache-nifi
Expert guidance for Apache NiFi data integration platform including flow design, processors, controller services, process groups, NiFi Registry integration, and cluster configuration. Use this when working with data flows, processors, or NiFi configuration.
$ 安裝
git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/design/apache-nifi ~/.claude/skills/claude-skill-registry// tip: Run this command in your terminal to install the skill
SKILL.md
name: apache-nifi description: Expert guidance for Apache NiFi data integration platform including flow design, processors, controller services, process groups, NiFi Registry integration, and cluster configuration. Use this when working with data flows, processors, or NiFi configuration. tags: [nifi, dataflow, data-integration] color: orange
Apache NiFi Expert Skill
You are an expert in Apache NiFi, a powerful data integration and distribution platform.
Core Concepts
Architecture
- FlowFiles: Data packages flowing through NiFi (content + attributes)
- Processors: Building blocks for data transformation and routing
- Connections: Queues between processors with backpressure
- Process Groups: Containers for organizing flows
- Controller Services: Shared services (DB connections, SSL contexts, etc.)
- Reporting Tasks: Monitoring and metrics collection
Key Processors
- Data Ingestion: GetFile, ListenHTTP, ConsumeKafka, GetFTP
- Data Egress: PutFile, InvokeHTTP, PublishKafka, PutFTP
- Routing & Transformation: RouteOnAttribute, UpdateAttribute, JoltTransformJSON
- Content: ExtractText, ReplaceText, SplitText, MergeContent
- Database: ExecuteSQL, PutSQL, QueryDatabaseTable
- Scripting: ExecuteScript (Groovy, Python, JavaScript)
Flow Design Best Practices
Organization
Process Group Structure:
├── Input Group (data ingestion)
├── Validation Group (data quality)
├── Transformation Group (ETL)
├── Enrichment Group (lookups)
└── Output Group (data egress)
Performance
- Use backpressure thresholds (object count + data size)
- Enable connection load balancing for clusters
- Set appropriate concurrent tasks per processor
- Use batching for high-volume flows
Error Handling
- Always configure failure relationships
- Use funnels to consolidate error handling
- Implement retry logic with penalization
- Log errors to provenance repository
NiFi Registry Integration
Version Control
# Connect to Registry
1. Add Registry Client in NiFi UI (Controller Settings)
2. Right-click Process Group → Version → Start version control
3. Select Registry, Bucket, and Flow name
4. Commit changes with descriptive messages
Best Practices
- Version control all production flows
- Use meaningful commit messages
- Create separate buckets per environment (dev/test/prod)
- Tag stable releases
Expression Language
Common Expressions
# Attributes
${filename} # Get attribute value
${filename:isEmpty()} # Check if empty
${filename:replace('.txt', '.csv')} # String manipulation
# Date/Time
${now():format('yyyy-MM-dd')} # Current date
${created:toDate('yyyy-MM-dd'):format('MM/dd/yyyy')} # Convert format
# Conditional
${attribute:equals('value'):ifElse('yes', 'no')} # If-else logic
# Content
${file.size} # FlowFile size
${file.size:gt(1000000)} # Size comparison
Configuration Files
nifi.properties (Key Settings)
# Web UI
nifi.web.http.host=0.0.0.0
nifi.web.http.port=8080
# Clustering
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1
nifi.cluster.node.protocol.port=11443
# State Management
nifi.state.management.embedded.zookeeper.start=true
nifi.zookeeper.connect.string=localhost:2181
# Repository Settings
nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
Docker Deployment
Single Node
services:
nifi:
image: apache/nifi:latest
ports:
- "8443:8443"
environment:
- SINGLE_USER_CREDENTIALS_USERNAME=admin
- SINGLE_USER_CREDENTIALS_PASSWORD=ctsBtRBKHRAx69EqUghvvgEvjnaLjFEB
volumes:
- ./nifi/conf:/opt/nifi/nifi-current/conf
- ./nifi/content_repository:/opt/nifi/nifi-current/content_repository
- ./nifi/database_repository:/opt/nifi/nifi-current/database_repository
- ./nifi/flowfile_repository:/opt/nifi/nifi-current/flowfile_repository
- ./nifi/provenance_repository:/opt/nifi/nifi-current/provenance_repository
- ./nifi/state:/opt/nifi/nifi-current/state
Cluster Configuration
services:
zookeeper:
image: zookeeper:3.8
environment:
ZOO_MY_ID: 1
ZOO_SERVERS: server.1=zookeeper:2888:3888;2181
nifi-1:
image: apache/nifi:latest
environment:
- NIFI_CLUSTER_IS_NODE=true
- NIFI_CLUSTER_NODE_PROTOCOL_PORT=11443
- NIFI_ZK_CONNECT_STRING=zookeeper:2181
- NIFI_ELECTION_MAX_WAIT=1 min
depends_on:
- zookeeper
Security
Authentication
- Single-User: Basic username/password (dev/test)
- LDAP: Enterprise directory integration
- OIDC: OAuth2/OpenID Connect (Keycloak)
- Certificate: mTLS with client certificates
Authorization
- File-based: policies.xml
- Ranger: Apache Ranger integration
- OIDC groups: Map OIDC groups to NiFi policies
Monitoring & Troubleshooting
Key Metrics
- FlowFile counts and sizes in queues
- Processor task durations
- System diagnostics (CPU, memory, disk)
- Bulletin board for errors
Logs
# Main logs
logs/nifi-app.log # Application log
logs/nifi-user.log # User actions
logs/nifi-bootstrap.log # Bootstrap process
# Enable DEBUG logging
conf/logback.xml → Set logger level to DEBUG
Common Issues
| Issue | Solution |
|---|---|
| Out of Memory | Increase heap in bootstrap.conf |
| Disk full | Clean content/provenance repos |
| Connection timeout | Check network/firewall rules |
| FlowFile stuck | Check backpressure, processor logs |
PLC4X Integration
PLC4X Processors
- GetPLC: Read data from PLCs (Siemens S7, Modbus, OPC-UA, etc.)
- PutPLC: Write data to PLCs
Example Configuration
Connection String: s7://192.168.1.100
Fields: DB1.DBW0:INT,DB1.DBD4:REAL,DB1.DBX8.0:BOOL
Polling Interval: 1 sec
API & CLI
NiFi REST API
# Get process groups
curl -u admin:password https://localhost:8443/nifi-api/process-groups/root
# Get processor status
curl -u admin:password https://localhost:8443/nifi-api/processors/{id}/status
NiFi CLI (Toolkit)
# Export flow
nifi-toolkit/cli.sh nifi export-flow -u http://localhost:8080/nifi -b /tmp/backup.json
# Import flow
nifi-toolkit/cli.sh nifi import-flow -u http://localhost:8080/nifi -i /tmp/backup.json
Resources
Repository

majiayu000
Author
majiayu000/claude-skill-registry/skills/design/apache-nifi
0
Stars
0
Forks
Updated9h ago
Added1w ago