🎬 YouTube Transcript & Metadata Extractor is a ready-to-run Apify actor from FlowExtract API for social media. Extract YouTube transcripts, subtitles, and complete video metadata in seconds - no manual work, no copy-paste, just pure automated data extraction. This page explains what it does, who it's for, and how to get the most out of it, then includes the full technical documentation below.
Use cases
- clean transcript: Remove "um", "uh", filler words
- include timestamps: Get second-by-second text timing
- extract comments: Pull top comments with replies (NEW!)
- repurpose videos into blogs - Extract transcript, feed to AI, generate 5 articles
Frequently asked questions
How do I run 🎬 YouTube Transcript & Metadata Extractor?
Open 🎬 YouTube Transcript & Metadata Extractor on the Apify Store, provide your input (URLs or search parameters), and click Start. Results are available as JSON, CSV, or Excel, and via the Apify API.
Is 🎬 YouTube Transcript & Metadata Extractor free?
🎬 YouTube Transcript & Metadata Extractor runs on Apify's pay-as-you-go platform; you only pay for the compute/usage a run consumes. Check the actor page for current pricing and any free tier.
Documentation
🎬 YouTube Transcript & Metadata Extractor
Extract YouTube transcripts, subtitles, and complete video metadata in seconds - no manual work, no copy-paste, just pure automated data extraction.
Extract transcripts from any YouTube video in 3 clicks
Paste URL → Click Start → Download data. That's it.
Why 1,000+ marketers, researchers, and developers choose this tool
| What You Get | Why It Matters |
|---|---|
| 5-second extraction | Process 100 videos while your coffee brews |
| 100% accurate transcripts | Official YouTube data, not AI guesses |
| Complete metadata | Views, likes, channel info, thumbnails - everything |
| Free tier available | Test with 10 videos before paying anything |
| Never extract twice | Smart caching saves time and money |
| Export anywhere | JSON, CSV, Excel, or direct API integration |
How It Works
Step 1: Paste Your URLs
{
"youtubeUrl": [
{ "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }
]
}
Step 2: Configure (Optional)
- Clean transcript: Remove "um", "uh", filler words
- Include timestamps: Get second-by-second text timing
- Extract comments: Pull top comments with replies (NEW!)
Step 3: Get Your Data
{
"videoId": "dQw4w9WgXcQ",
"Video_title": "Rick Astley - Never Gonna Give You Up",
"Views": "1,234,567,890 views",
"transcriptText": "We're no strangers to love...",
"channel": {
"name": "Rick Astley",
"subscribers": "3.2M subscribers"
}
}
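Note that the count fields in the sample output are display strings ("1,234,567,890 views", "3.2M subscribers"), not numbers. A minimal sketch for converting them before analysis, assuming K, M, and B are the only abbreviations that appear (that is an assumption, and `parse_count` is a hypothetical helper, not part of the actor):

```python
import re

# Assumed abbreviations; extend if other suffixes show up in your data.
_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_count(text):
    """Convert strings like '5,067 views' or '33.1K subscribers' to an int."""
    match = re.match(r"\s*([\d,]*\.?\d+)([KMB]?)\b", text)
    if not match:
        return None  # no leading number found
    number = float(match.group(1).replace(",", ""))
    return round(number * _MULTIPLIERS[match.group(2)])

print(parse_count("1,234,567,890 views"))
print(parse_count("3.2M subscribers"))
```

Abbreviated values such as "3.2M" are already rounded by YouTube, so the converted number is approximate.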
What You Can Do With This Data
Content Creators & Marketers
- Repurpose videos into blogs - Extract transcript, feed to AI, generate 5 articles
- Create social media posts - Pull key quotes and timestamps
- Generate SEO-optimized content - Turn video content into searchable text
- Subtitle generation - Export timestamps for perfect captions
Real example: Marketing agency extracts 100 competitor videos weekly, identifies trending topics, creates counter-content. Result: 40% traffic increase in 3 months.
Researchers & Academics
- Analyze lecture content at scale - Process entire course catalogs
- Study speaking patterns - Extract timestamps for linguistic analysis
- Sentiment analysis - Feed transcripts into NLP models
- Citation extraction - Find and verify sources mentioned in videos
Real example: PhD student analyzed 500 TED Talks in 2 days instead of 6 months, discovered key patterns in successful presentations.
Developers & AI Teams
- Train chatbots - Use video transcripts as training data
- Build recommendation engines - Analyze content similarity
- Automated workflows - Trigger actions based on new videos
- Knowledge base creation - Convert video libraries into searchable databases
Real example: SaaS company built AI support bot using 1,000 tutorial video transcripts. Result: 60% reduction in support tickets.
Business Intelligence
- Competitor monitoring - Track what competitors are saying
- Brand sentiment tracking - Analyze mentions across video content
- Market research - Extract insights from industry thought leaders
- Product feedback analysis - Process customer testimonial videos
Real example: E-commerce brand analyzes 200 review videos monthly, identifies pain points, improves product design. Result: 25% reduction in returns.
Complete Feature List
Core Extraction
- Full video transcripts with cleaned text
- Precise second-by-second timestamps
- Video metadata (title, views, likes, date)
- Channel information (name, subscribers, verification)
- Thumbnail URLs (multiple resolutions)
- Video descriptions and tags
- Duration and word count analytics
Smart Processing
- 3 cleaning levels: None (raw), Mild (remove "um"/"uh"), Aggressive (conversational cleanup)
- Automatic deduplication: Never process the same video twice
- Batch processing: Handle 1 to 1000+ videos
- Error recovery: Auto-retry on temporary failures
- Multiple formats: All YouTube URL types (standard, shorts, live, youtu.be)
NEW: Comment Extraction
- Top comments with like counts
- Comment replies (configurable depth)
- Sort by "top" or "newest"
- Real-time streaming extraction
- Automatic resumption if interrupted
Export & Integration
- JSON, CSV, Excel download
- Direct API access
- Webhook integration
- Pre-configured data views
- Apify platform integration
Input Configuration
Basic Example (Just Transcripts)
{
"youtubeUrl": [
{ "url": "https://www.youtube.com/watch?v=VIDEO_ID" }
],
"cleaningLevel": "mild",
"includeTimestamps": true
}
Advanced Example (With Comments)
{
"youtubeUrl": [
{ "url": "https://www.youtube.com/watch?v=VIDEO_ID" }
],
"cleaningLevel": "aggressive",
"includeTimestamps": true,
"extractcomments": true,
"sortBy": "top",
"maxComments": 50,
"maxRepliesPerComment": 5
}
All Parameters
| Parameter | Type | Default | What It Does |
|---|---|---|---|
| youtubeUrl | array | Required | List of YouTube video URLs (any format) |
| cleaningLevel | string | "mild" | "none" (raw), "mild" (remove filler), "aggressive" (clean conversations) |
| includeTimestamps | boolean | true | Include precise timing for each text segment |
| extractcomments | boolean | false | Enable comment extraction (adds 10-40s per video) |
| sortBy | string | "top" | Comment sort: "top" (most relevant) or "newest" (chronological) |
| maxComments | integer | 10 | Max top-level comments (10-100,000) |
| maxRepliesPerComment | integer | 0 | Max replies per comment. 0 = no replies (10x faster) |
Pro tip: Start with maxRepliesPerComment: 0 for 10x faster extraction if you don't need reply threads.
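If you build inputs programmatically, it can help to check values against the parameter list before starting a run. The sketch below is one way to do that in Python; `build_input` is a hypothetical helper, and the allowed values and the 10-100,000 range are taken from the parameter descriptions above rather than from the actor's own validation.

```python
ALLOWED_CLEANING = {"none", "mild", "aggressive"}
ALLOWED_SORT = {"top", "newest"}

def build_input(urls, cleaning_level="mild", include_timestamps=True,
                extract_comments=False, sort_by="top",
                max_comments=10, max_replies=0):
    """Assemble an actor input dict, rejecting values the docs do not list."""
    if not urls:
        raise ValueError("youtubeUrl is required and must not be empty")
    if cleaning_level not in ALLOWED_CLEANING:
        raise ValueError(f"cleaningLevel must be one of {sorted(ALLOWED_CLEANING)}")
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORT)}")
    run_input = {
        "youtubeUrl": [{"url": u} for u in urls],
        "cleaningLevel": cleaning_level,
        "includeTimestamps": include_timestamps,
        "extractcomments": extract_comments,
    }
    if extract_comments:
        # Documented range for maxComments is 10-100,000.
        if not 10 <= max_comments <= 100_000:
            raise ValueError("maxComments must be between 10 and 100,000")
        run_input.update(sortBy=sort_by, maxComments=max_comments,
                         maxRepliesPerComment=max_replies)
    return run_input
```

Comment-related keys are only added when comment extraction is on, which keeps transcript-only inputs as small as the basic example.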
Output Structure
What You Get for Every Video
{
"videoId": "1TThGG6guf0",
"VideoURL": "https://youtu.be/1TThGG6guf0",
"Video_title": "WordPress Custom Widget Development Tutorial",
"published_Date": "Aug 12, 2020",
"Views": "5,067 views",
"likes": "122",
"channel": {
"name": "Codeytek Academy",
"id": "UC0SDxbLAqoKLACyEPz2wXAg",
"subscribers": "33.1K subscribers",
"verified": false
},
"thumbnail": "https://i.ytimg.com/vi/1TThGG6guf0/maxresdefault.jpg",
"Description": "Learn how to create custom WordPress widgets...",
"hasTranscript": true,
"transcriptText": "Hello and welcome everyone to another episode of advanced WordPress theme development. Today we're going to learn how to create custom widgets...",
"timestamps": [
{ "time": "0:08", "text": "hello and welcome everyone to another" },
{ "time": "0:10", "text": "episode of advanced wordpress theme" },
{ "time": "0:12", "text": "development today we're going to learn" }
],
"wordCount": 2847,
"estimatedDuration": "11:23"
}
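The `timestamps` array carries only a start time per segment, so turning it into subtitles means deriving end times yourself. A rough sketch that writes SRT cues, assuming each cue ends where the next one starts and that longer videos use an "H:MM:SS" form (both are assumptions; the helper names are hypothetical):

```python
def to_seconds(stamp):
    """Convert 'M:SS' or 'H:MM:SS' to whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def srt_time(seconds):
    """Format whole seconds as an SRT timecode (HH:MM:SS,mmm)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},000"

def to_srt(timestamps, last_cue_seconds=2):
    """Build SRT text; the last cue gets an assumed fixed duration."""
    starts = [to_seconds(t["time"]) for t in timestamps]
    blocks = []
    for i, (start, cue) in enumerate(zip(starts, timestamps)):
        end = starts[i + 1] if i + 1 < len(starts) else start + last_cue_seconds
        blocks.append(f"{i + 1}\n{srt_time(start)} --> {srt_time(end)}\n{cue['text']}\n")
    return "\n".join(blocks)
```

Because the source times have one-second resolution, the resulting captions are only as precise as the segments YouTube provides.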
With Comments Enabled
{
"videoId": "dQw4w9WgXcQ",
"Video_title": "Never Gonna Give You Up",
"transcriptText": "We're no strangers to love...",
"commentsExtracted": true,
"commentCount": 50,
"comments": [
{
"commentId": "UgxQe-6VK3h-LZaul6x4AaABAg",
"authorName": "@musiclover2024",
"text": "Still the best song after all these years!",
"likeCount": "1,543",
"replyCount": 12,
"publishedTime": "2 days ago",
"replies": [
{
"commentId": "UgxQe-6VK3h-LZaul6x4AaABAg.9kF7...",
"authorName": "@throwback90s",
"text": "Facts! Never gets old.",
"likeCount": "234",
"publishedTime": "1 day ago"
}
]
}
]
}
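For CSV export or sentiment analysis, the nested `comments` / `replies` structure is often easier to handle as flat rows. A small sketch, assuming the field names shown in the sample above; `flatten_comments` is a hypothetical helper:

```python
def _row(video_id, entry, parent_id):
    """One flat record for a comment or a reply."""
    return {
        "video_id": video_id,
        "comment_id": entry["commentId"],
        "parent_id": parent_id,  # None for top-level comments
        "author": entry["authorName"],
        "text": entry["text"],
        "likes": entry["likeCount"],
    }

def flatten_comments(item):
    """Yield one row per comment and per reply in a dataset item."""
    for comment in item.get("comments", []):
        yield _row(item["videoId"], comment, None)
        for reply in comment.get("replies", []):
            yield _row(item["videoId"], reply, comment["commentId"])
```

Each reply row keeps its parent's `commentId`, so threads can be rebuilt later if needed.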
Pricing & Performance
Transcript Extraction
- Free mode: 5-10 seconds per video
- Paid mode: 3-5 seconds per video (faster infrastructure)
- Cost: ~$0.001-0.006 per video (depending on length)
Comment Extraction (Optional Add-on)
Uses YouTube Comments Scraper in Standby Mode.
Pricing:
- Actor start: $0.001 (once per run)
- Parent comments: $0.003 each
- Replies: $0.0015 each
Example cost for 50 comments + 100 replies:
- Start: $0.001
- Comments: 50 × $0.003 = $0.15
- Replies: 100 × $0.0015 = $0.15
- Total: $0.301 (about $0.30)
With Apify subscription discounts:
- Bronze: 50% off → about $0.15 total
- Silver: 67% off → about $0.10 total
- Gold: 73% off → about $0.08 total
Speed impact:
- Without replies: +5-10 seconds per video
- With replies (10 per comment): +20-40 seconds per video
Cost optimization tip: Set maxRepliesPerComment: 0 if you don't need reply threads - you'll get 10x faster extraction and cut costs in half.
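To budget a run before starting it, the per-event prices listed above (actor start, parent comments, replies) can be combined in a few lines. This is a rough estimator only: it assumes a subscription discount applies to the whole comment-extraction total, which this page does not confirm, and `estimate_comment_cost` is a hypothetical helper.

```python
# Per-event prices as listed in the Pricing section (USD).
START_FEE = 0.001
PER_COMMENT = 0.003
PER_REPLY = 0.0015

def estimate_comment_cost(comments, replies=0, discount=0.0):
    """Estimated comment-extraction cost for one run.

    discount is a fraction, e.g. 0.5 for 50% off (assumed to apply
    to the full total).
    """
    base = START_FEE + comments * PER_COMMENT + replies * PER_REPLY
    return round(base * (1 - discount), 4)

print(estimate_comment_cost(50, 100))
print(estimate_comment_cost(50, 0))
```

Transcript extraction is billed separately, so add the per-video transcript cost on top of this figure.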
Supported YouTube URL Formats
We handle ALL YouTube URL types:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/shorts/VIDEO_ID
- https://www.youtube.com/live/VIDEO_ID
- https://youtube.com/watch?v=VIDEO_ID (no www)
- https://m.youtube.com/watch?v=VIDEO_ID (mobile)
Just paste any YouTube link - we'll figure it out.
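If you want to deduplicate or validate links on your side before a run, the video ID can be pulled out of each of the formats above with the standard library. This is an illustrative sketch covering only the six listed shapes; `extract_video_id` is a hypothetical helper and is not how the actor itself parses URLs.

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(url):
    """Return the video ID for the URL shapes listed above, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    parts = [p for p in parsed.path.split("/") if p]
    if host == "youtu.be":
        return parts[0] if parts else None
    if host in {"youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        if len(parts) == 2 and parts[0] in {"shorts", "live"}:
            return parts[1]
    return None
```

Two links that resolve to the same ID point at the same video, so comparing IDs rather than raw URLs avoids submitting a video twice.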
Pre-Configured Data Views
Save time with our built-in export templates:
1. Full Dataset
Everything - Complete metadata, transcripts, timestamps, analytics
Use for: Comprehensive analysis, data warehousing
2. Transcripts Only
Focus: Transcript text, timestamps, word count, duration
Use for: Content repurposing, subtitle generation
3. Channel Analytics
Focus: Channel info, subscribers, verification, video list
Use for: Influencer research, competitor analysis
Quick Start Examples
Use Case 1: Content Repurposing
Goal: Turn video into blog post
// 1. Extract transcript
const input = {
youtubeUrl: [{ url: "YOUR_VIDEO_URL" }],
cleaningLevel: "aggressive",
includeTimestamps: false
};
// 2. Run actor
// 3. Get output: feed transcriptText to ChatGPT/Claude
// 4. Generate 5 blog posts in 2 minutes
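Long transcripts may not fit in a single prompt when you feed `transcriptText` to a language model. One simple option is to split the text into fixed-size pieces first. The sketch below chunks by word count, which is only a stand-in for real token counting; `chunk_words` and the 500-word default are assumptions for illustration.

```python
def chunk_words(text, max_words=500):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

# Example: a short string split into 2-word chunks.
print(chunk_words("we're no strangers to love", max_words=2))
```

Each chunk can then be sent as its own prompt, with the results merged afterwards.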
Use Case 2: Competitor Analysis
Goal: Analyze 100 competitor videos
const input = {
youtubeUrl: [
{ url: "competitor_video_1" },
{ url: "competitor_video_2" },
// ... paste 100 URLs
],
cleaningLevel: "mild",
extractcomments: true,
maxComments: 20
};
// Export to Excel → Analyze trends → Find content gaps
Use Case 3: AI Training Data
Goal: Build chatbot training dataset
const input = {
youtubeUrl: YOUR_PLAYLIST_URLS, // from our Playlist Extractor
cleaningLevel: "none", // keep raw data for AI
includeTimestamps: true
};
// Process 1000 videos → Clean dataset → Train model
Integration Examples
JavaScript/Node.js
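// apify-client exposes ApifyClient as a named export
const { ApifyClient } = require('apify-client');
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const input = {
  youtubeUrl: [
    { url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }
  ],
  cleaningLevel: "mild"
};
// await is only valid inside an async function in CommonJS modules
(async () => {
  // Start the actor and wait for the run to finish
  const run = await client.actor("dz_omar/youtube-transcript-metadata-extractor").call(input);
  // Read the results from the run's default dataset
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items[0].transcriptText);
})();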
Python
from apify_client import ApifyClient
client = ApifyClient('YOUR_API_TOKEN')
run_input = {
"youtubeUrl": [
{ "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }
],
"cleaningLevel": "mild"
}
run = client.actor("dz_omar/youtube-transcript-metadata-extractor").call(run_input=run_input)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item['transcriptText'])
cURL (Direct API)
curl -X POST https://api.apify.com/v2/acts/dz_omar~youtube-transcript-metadata-extractor/runs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-d '{
"youtubeUrl": [
{ "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }
]
}'
→ View full API documentation
Frequently Asked Questions
What if a video doesn't have a transcript?
The actor will return "hasTranscript": false and still provide all other metadata (title, views, channel info, etc.). YouTube auto-generates transcripts for most videos, but some may not have them available.
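In downstream code it is worth checking this flag before reading transcript fields. A minimal sketch that separates the two cases, assuming `items` is the list of dataset records from a run; `split_by_transcript` is a hypothetical helper:

```python
def split_by_transcript(items):
    """Return (items with a transcript, items without one)."""
    with_text, without_text = [], []
    for item in items:
        target = with_text if item.get("hasTranscript") else without_text
        target.append(item)
    return with_text, without_text
```

Items in the second list still carry metadata such as title and channel, so they can be logged or reported rather than dropped.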
Can I extract from private or age-restricted videos?
No. This actor only works with publicly available videos. Private, members-only, or age-restricted content cannot be accessed.
How accurate are the transcripts?
We extract the official transcripts provided by YouTube. Accuracy depends on whether the creator uploaded manual captions (99% accurate) or YouTube auto-generated them (85-95% accurate). We don't modify or AI-generate transcripts - you get exactly what YouTube provides.
Can I process entire playlists or channels?
This actor processes individual videos. For bulk extraction, use our YouTube Playlist Extractor to get all video URLs, then feed them to this actor.
What's the difference between cleaning levels?
- None: Raw transcript as YouTube provides it
- Mild: Removes filler words ("um", "uh", "you know")
- Aggressive: Removes filler words + conversational redundancy ("so basically", "I mean", etc.)
Most users choose "mild" for the best balance.
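To get a feel for what "mild" cleaning means, here is a rough illustration of filler-word removal using the three fillers named above. It is a sketch of the general idea only, not the actor's implementation, and `strip_fillers` is a hypothetical name.

```python
import re

# Fillers named in the FAQ for the "mild" level.
FILLERS = ["um", "uh", "you know"]
_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b,?\s*",
    re.IGNORECASE,
)

def strip_fillers(text):
    """Remove whole-word fillers plus a trailing comma and spaces."""
    cleaned = _PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("Um, so you know, we uh start here"))
```

A pattern this naive also removes meaningful uses of "you know" (as in "Do you know the answer?"), which is one reason to compare cleaned output against the raw level before relying on it.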
Do I need comments? Should I enable replies?
Enable comments if: You need engagement insights, sentiment analysis, or customer feedback.
Enable replies if: You need full conversation threads. Warning: Replies significantly increase runtime and cost.
Fastest option: extractcomments: false - just transcripts in 5-10 seconds.
Is this legal? Will I get in trouble?
This actor extracts publicly available data that anyone can see on YouTube. It's equivalent to manually copying text. However:
- Extracting public transcripts: Generally okay
- Using for research, analysis, personal use: Generally okay
- Republishing copyrighted content: Check copyright laws
- Commercial use at scale: Review YouTube's Terms of Service
We are not lawyers. Consult legal counsel for your specific use case.
Why did my run fail?
Common reasons:
- Video is private, deleted, or age-restricted
- Video has no transcript available
- Invalid URL format
- Temporary YouTube API issues (auto-retry handles this)
Check the run log for specific error messages.
How many videos can I process at once?
Technical limit: 10,000 URLs per run
Recommended: Start with 10-50 to test, then scale up
Performance: Free mode handles 100 videos in ~10 minutes. Paid mode is 2x faster.
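One way to follow the "start small, then scale up" advice is to split a long URL list into batches and submit one batch per run. A minimal sketch, assuming a batch size of 50 (taken from the recommended test range above); `batch_urls` is a hypothetical helper:

```python
def batch_urls(urls, batch_size=50):
    """Drop duplicate URLs (keeping order) and group them into
    lists shaped like the actor's youtubeUrl input."""
    unique = list(dict.fromkeys(urls))
    return [
        [{"url": u} for u in unique[i:i + batch_size]]
        for i in range(0, len(unique), batch_size)
    ]
```

Each returned batch can be passed directly as the `youtubeUrl` value of a separate run, which also limits how much work is lost if one run fails.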
Technical Details
For Developers: Architecture & Performance
Tech Stack
- Crawler: CheerioCrawler (optimized for speed)
- Language: Node.js 20
- Memory: 128MB-512MB (auto-scales)
- Storage: Persistent caching with key-value store
- API Format: RESTful JSON endpoint
Performance Specs
- Throughput: 500-1000 videos/hour (paid mode)
- Latency: 5-10s per video (free), 3-5s (paid)
- Concurrency: 10 parallel requests
- Retry Logic: 3 attempts with exponential backoff
- Cache Hit Rate: ~40% (typical usage)
Comment Integration
- Method: Standby Mode API call to YouTube Comments Scraper
- Protocol: NDJSON streaming over HTTP
- Resumption: Automatic from last successful comment
- Timeout: 5 minutes per video (configurable)
Data Processing
- Transcript cleaning: Regex + NLP tokenization
- Timestamp parsing: ISO 8601 → human-readable
- Deduplication: SHA-256 hash of video ID
- Output format: Minified JSON (reduce bandwidth)
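The deduplication bullet above mentions a SHA-256 hash of the video ID. A minimal sketch of that idea follows, with an in-memory set standing in for the key-value store; `cache_key` and `filter_new` are hypothetical names, and the actor's real cache layout is not documented here.

```python
import hashlib

def cache_key(video_id):
    """Hex SHA-256 digest of the video ID, usable as a storage key."""
    return hashlib.sha256(video_id.encode("utf-8")).hexdigest()

def filter_new(video_ids, seen_keys):
    """Return IDs not seen before and record their keys in seen_keys."""
    fresh = []
    for vid in video_ids:
        key = cache_key(vid)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(vid)
    return fresh
```

Hashing gives fixed-length keys regardless of the ID format, at the cost of not being able to read the original ID back from the key.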
Rate Limiting
- YouTube API: Respects official rate limits
- Apify Platform: Auto-throttles to prevent blocking
- Proxy support: Automatic rotation (paid tier)
Error Handling
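// Auto-retry logic: wait 1s, 3s, then 9s between attempts
const MAX_RETRIES = 3;
const BACKOFF_MS = [1000, 3000, 9000];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
async function extractWithRetry(videoId) {
  for (let retries = 0; ; retries++) {
    try {
      return await extractTranscript(videoId);
    } catch (error) {
      // Give up once every retry has been used
      if (retries >= MAX_RETRIES) throw error;
      await sleep(BACKOFF_MS[retries]);
    }
  }
}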
Security
- No authentication required (public data only)
- HTTPS-only API calls
- Input sanitization (XSS prevention)
- No data retention beyond run duration
Comparison: Why Choose This Actor?
| Feature | This Actor | Manual Copy-Paste | Other Tools |
|---|---|---|---|
| Speed | 5-10 sec/video | 5-10 min/video | 30-60 sec/video |
| Accuracy | 100% (official data) | 100% | 70-90% (AI guessing) |
| Batch processing | Yes, 1000+ videos | No, one at a time | Limited (10-50) |
| Timestamps | Second-precise | Manual work | Minute-level only |
| Metadata | Everything | Manual scraping | Basic only |
| Comments | With replies | Screenshot only | Not available |
| API access | Full REST API | None | Limited |
| Cost | $0.001-0.006/video | $5-10/video (labor) | $0.10-0.50/video |
| Setup time | 1 minute | N/A | 30-60 minutes |
Trust & Reliability
Platform Performance
- Actor running smoothly - 100% success rate
- Regular updates - Maintained actively
Legal & Compliance
What You Should Know
What's allowed:
- Extracting public transcripts for personal use
- Research and academic analysis
- Business intelligence and market research
- Content repurposing with proper attribution
What requires caution:
- Large-scale commercial redistribution
- Republishing copyrighted video content
- Using data in ways that violate YouTube's ToS
- Scraping private or restricted content
Privacy & Data:
- We only extract publicly visible data
- No authentication or login required
- No data stored beyond your run duration
- GDPR compliant (EU users)
Disclaimer: This tool extracts publicly available data. Users are responsible for ensuring their usage complies with YouTube's Terms of Service, copyright laws, and applicable regulations. We are not lawyers - consult legal counsel for commercial use cases.
Related Tools from FlowExtract API
Build your complete YouTube data pipeline with our specialized actors:
Video & Content Tools
YouTube Playlist Extractor
Extract all videos from playlists in seconds. Get video URLs, titles, durations. Perfect for feeding into this transcript extractor.
Use together: Playlist → Transcript Extractor = Full channel analysis
YouTube Channel Scraper Pro
Complete channel extraction: videos, shorts, live streams, playlists. Comprehensive creator analytics.
Use together: Channel Scraper → Transcript Extractor = Creator deep-dive
YouTube Comments Scraper
Standalone comment extraction with advanced filtering. Perfect for sentiment analysis.
Integrated in this actor - enable with extractcomments: true
Video Platform Tools
Zoom Scraper | Downloader & Transcript
Extract Zoom meeting recordings and transcripts. Perfect for meeting analysis.
Loom Scraper | Downloader & Transcript
Download Loom videos and extract transcripts. Ideal for training content.
Real Estate Data
Idealista Scraper API
Spanish real estate listings with API access. Property data at scale.
Developer Tools
Screenshot | Ultimate Screenshot
Webpage screenshots with custom options. Perfect for monitoring and documentation.
Network Security Scanner
Website security vulnerability scanning. Comprehensive security reports.
→ View all FlowExtract API tools
Get Started Now
Free Trial Available
No credit card required. Test with $5 to see the quality yourself.
Need Help?
Email: [email protected]
Website: flowextractapi.com
Twitter: @FlowExtractAPI
LinkedIn: flowextract-api
Response time: Within 24 hours (usually much faster)
