YouTube videos often contain valuable information, and extracting transcripts can be useful for various purposes, from accessibility to content analysis. In this article, we'll explore how to use JavaScript to retrieve and process YouTube transcripts.
Prerequisites
Before diving into the code, make sure you have a basic understanding of JavaScript and the Document Object Model (DOM). Additionally, you'll need a YouTube video ID, which you can extract from the video URL.
Javascript Code
function reteriveTranscript() {
const videoId = new URLSearchParams(window.location.search).get('v');
const YT_INITIAL_PLAYER_RESPONSE_RE =
/ytInitialPlayerResponse\s*=\s*({.+?})\s*;\s*(?:var\s+(?:meta|head)|<\/script|\n)/;
let player = window.ytInitialPlayerResponse;
if (!player || videoID !== player.videoDetails.videoId) {
fetch('https://www.youtube.com/watch?v=' + videoId)
.then(function (response) {
return response.text();
})
.then(function (body) {
const playerResponse = body.match(YT_INITIAL_PLAYER_RESPONSE_RE);
if (!playerResponse) {
console.warn('Unable to parse playerResponse');
return;
}
player = JSON.parse(playerResponse[1]);
const metadata = {
title: player.videoDetails.title,
duration: player.videoDetails.lengthSeconds,
author: player.videoDetails.author,
views: player.videoDetails.viewCount,
};
// Get the tracks and sort them by priority
const tracks = player.captions.playerCaptionsTracklistRenderer.captionTracks;
tracks.sort(compareTracks);
// Get the transcript
fetch(tracks[0].baseUrl + '&fmt=json3')
.then(function (response) {
return response.json();
})
.then(function (transcript) {
const result = { transcript: transcript, metadata: metadata };
const parsedTranscript = transcript.events
// Remove invalid segments
.filter(function (x) {
return x.segs;
})
// Concatenate into single long string
.map(function (x) {
return x.segs
.map(function (y) {
return y.utf8;
})
.join(' ');
})
.join(' ')
// Remove invalid characters
.replace(/[\u200B-\u200D\uFEFF]/g, '')
// Replace any whitespace with a single space
.replace(/\s+/g, ' ');
// Use 'result' here as needed
console.log('EXTRACTED_TRANSCRIPT', parsedTranscript);
});
});
}
}
function compareTracks(track1, track2) {
const langCode1 = track1.languageCode;
const langCode2 = track2.languageCode;
if (langCode1 === 'en' && langCode2 !== 'en') {
return -1; // English comes first
} else if (langCode1 !== 'en' && langCode2 === 'en') {
return 1; // English comes first
} else if (track1.kind !== 'asr' && track2.kind === 'asr') {
return -1; // Non-ASR comes first
} else if (track1.kind === 'asr' && track2.kind !== 'asr') {
return 1; // Non-ASR comes first
}
return 0; // Preserve order if both have same priority
}
Code Explanation
retrieveTranscript
FunctionRetrieves the video ID from the query parameters of the current URL.
const videoId = new URLSearchParams(window.location.search).get('v');
Defines a regular expression to extract the ytInitialPlayerResponse object from the YouTube video page's HTML.
const YT_INITIAL_PLAYER_RESPONSE_RE = /ytInitialPlayerResponse\s*=\s*({.+?})\s*;\s*(?:var\s+(?:meta|head)|<\/script|\n)/;
Checks if the ytInitialPlayerResponse object is available and matches the current video ID. If not, it fetches the YouTube video page HTML to extract the necessary information.
if (!player || videoId !== player.videoDetails.videoId) { /* ... */ }
Fetches the HTML content of the YouTube video page.
fetch('https://www.youtube.com/watch?v=' + videoId) .then(response => response.text()) .then(body => { /* ... */ });
Extracts and parses the
ytInitialPlayerResponse
object from the HTML.const playerResponse = body.match(YT_INITIAL_PLAYER_RESPONSE_RE); player = JSON.parse(playerResponse[1]);
Extracts metadata such as video title, duration, author, and views from the player
const metadata = { /* ... */ };
Sorts caption tracks based on language and type, then fetches the transcript data using the selected track's base URL.
const tracks = player.captions.playerCaptionsTracklistRenderer.captionTracks; tracks.sort(compareTracks); fetch(tracks[0].baseUrl + '&fmt=json3') .then(response => response.json()) .then(transcript => { /* ... */ });
Processes the raw transcript data by filtering out invalid segments, mapping the UTF-8 text, and cleaning up unnecessary characters and whitespace.
const parsedTranscript = transcript.events .filter(x => x.segs) .map(x => x.segs.map(y => y.utf8).join(' ')) .join(' ') .replace(/[\u200B-\u200D\uFEFF]/g, '') .replace(/\s+/g, ' ');
Generates the final content, including metadata, transcript, and placeholders for additional instructions.
const parsedTranscript = [ /* ... */ ].join('\n');
Sorting caption tracks sorts the available caption tracks, giving priority to English and non-automatic speech recognition (ASR) tracks.
Usage
To use the script, embed it in your project and call the retrieveTranscript
function. Ensure that the YouTube API response format remains consistent, as the script relies on the structure of the ytInitialPlayerResponse
object.
Summary
With this JavaScript code, you can easily retrieve and process YouTube transcripts programmatically. Feel free to adapt the code to suit your specific requirements or integrate it into web applications for enhanced functionality.
Happy Coding !!!