A hack to rip content from the State-Of-The-Art Intellectual Property Protection Technology of Lyrics.Wikia.Com (Which We Have To Use Because Of Licensing Agreement)

A simple Ruby script that prints the lyrics as HTML (pass the artist and the song title as arguments):

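require 'open-uri'
require 'erb' # ERB::Util.url_encode stands in for URI.escape (removed in Ruby 3.0)
artist, song = ARGV.first(2).map { |arg| ERB::Util.url_encode(arg) }
puts URI.open("http://lyrics.wikia.com/#{artist}:#{song}").read.scan(%r|<div class='lyricbox'><div class='rtMatcher'>.*?</div>(.*?)<!--|m)[0][0]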
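
To see what the regular expression in the script above captures without touching the network, here is a minimal sketch run against a made-up page fragment. The sample markup and the `extract_lyrics_html` helper name are assumptions for illustration; only the pattern itself comes from the script.

```ruby
# Pattern from the script above: skip the nested 'rtMatcher' div,
# then capture everything up to the first HTML comment.
LYRICBOX = %r|<div class='lyricbox'><div class='rtMatcher'>.*?</div>(.*?)<!--|m

# Hypothetical helper: returns the captured HTML fragment, or nil when the
# page has no lyricbox (the script above would raise NoMethodError instead).
def extract_lyrics_html(page)
  match = page.match(LYRICBOX)
  match && match[1]
end

# Made-up sample; the real page markup may differ.
sample = "<html><div class='lyricbox'><div class='rtMatcher'>ad</div>" \
         "First line<br />Second line<!-- end --></div></html>"

puts extract_lyrics_html(sample)
```

Returning nil for a missing lyricbox makes it easier to report "song not found" than the bare `[0][0]` indexing does.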

Plain-text lyrics take more work and need the hpricot gem:

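require 'open-uri'
require 'erb' # ERB::Util.url_encode stands in for URI.escape (removed in Ruby 3.0)
require 'hpricot'

artist, song = ARGV.first(2).map { |arg| ERB::Util.url_encode(arg) }
doc = Hpricot.parse(URI.open("http://lyrics.wikia.com/#{artist}:#{song}"))
# Keep every child of the lyricbox except nested divs, then flatten to text.
puts doc.at('div.lyricbox').children.reject { |node| node.name == "div" }.map(&:to_plain_text).join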
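
If Hpricot is not available, a rough standard-library-only conversion can be sketched as follows. It assumes the HTML fragment contains only `<br />` line breaks, other simple tags, and character references such as `&#72;`. The `html_to_plain` name and the `NAMED` table are hypothetical, and named entities beyond the four listed are left untouched.

```ruby
# Assumed minimal set of named entities; extend as needed.
NAMED = { "lt" => "<", "gt" => ">", "quot" => '"', "amp" => "&" }.freeze

# Hypothetical helper: converts an HTML lyrics fragment to plain text.
def html_to_plain(html)
  # Line breaks become newlines, every other tag is dropped.
  text = html.gsub(%r{<br\s*/?>}i, "\n").gsub(/<[^>]*>/, "")
  # Decode all references in a single pass so nothing is decoded twice.
  text.gsub(/&(?:#(\d+)|#[xX](\h+)|(lt|gt|quot|amp));/) do
    if $1
      [$1.to_i].pack("U")   # decimal reference, e.g. &#72;
    elsif $2
      [$2.hex].pack("U")    # hexadecimal reference, e.g. &#x48;
    else
      NAMED[$3]             # basic named entity
    end
  end
end

# Made-up sample fragment.
sample = "&#72;&#101;&#108;&#108;&#111;<br />&#x77;orld &amp; <i>all</i>"
puts html_to_plain(sample)
```

This is not a real HTML parser: it will mishandle comments, scripts, and malformed markup, which is where a library such as Hpricot earns its keep.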

HTML and TXT scrapers are on GitHub.


Last edited: Voker57 on 2012-10-10 11:22:52 +0000