Thursday, December 20, 2012

Automatically download subtitles and shows

For those of you who love to watch TV shows downloaded from the internet but hate the procedure of finding the correct torrent file and subtitles, I will present a way to automate the whole process.

What we will need:
  1. Utorrent client:  http://www.utorrent.com/
  2. Python 2.7.3 http://www.python.org/download/releases/2.7.3/
  3. BeautifulSoup 3.2.1 (Python library)
      http://www.crummy.com/software/BeautifulSoup/
      http://www.crummy.com/software/BeautifulSoup/download/3.x/BeautifulSoup-3.2.1.tar.gz
  4. Periscope (Python module)
    http://code.google.com/p/periscope/

    python-periscope_0.2.4.orig.tar.gz
    (Or use my  python-periscope_0.2.4.orig.plirkee.repack)
 Target System: (in my case) Windows Vista
 Let's begin.

  •  Utorrent (briefly, because you can find instructions for this on many sites)
  1.  First, install Utorrent and open it.
  2.  Right-click on Feeds and add a new RSS feed
    (it can be the RSS feed of any site you like; mine is EzTv, so I add a feed with the feed URL:  http://feeds.feedburner.com/eztv-rss-atom-feeds?format=xml )
  3. Go to Options => RSS Downloader and add a new rule for your favorite show...
And with that you are ready to automatically download your shows as soon as they are available! We will come back to the utorrent settings later to make it download subtitles as well, but first we need to set up Python.
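
To give an idea of what such an RSS rule does, here is a minimal Python sketch. It is not how utorrent is implemented; the sample feed, the example.invalid links and the wildcard syntax are all assumptions made for illustration.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical two-item feed; a real feed would be fetched from the feed URL.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>Some Show S01E02 HDTV x264</title><link>http://example.invalid/1.torrent</link></item>
  <item><title>Other Show S03E10 720p HDTV</title><link>http://example.invalid/2.torrent</link></item>
</channel></rss>"""


def matching_items(feed_xml, pattern):
    """Return (title, link) pairs whose title matches a wildcard pattern, ignoring case."""
    root = ET.fromstring(feed_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Lower-case both sides so the rule is case-insensitive.
        if fnmatch.fnmatchcase(title.lower(), pattern.lower()):
            matches.append((title, link))
    return matches
```

Calling matching_items(SAMPLE_FEED, "some show*") would pick only the first item, which is roughly what a rule for one favorite show is meant to do.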

  • Python
  1. install python to c:\python27
  2. download BeautifulSoup, unzip it into c:\python27\BeautifulSoup,
    run from command prompt: 
    cd c:\python27\BeautifulSoup\
    setup.py install
  3. download Periscope, unzip it into c:\python27\Periscope,
    run from command prompt:
    cd c:\python27\Periscope\
    setup.py install
  4. from the command prompt, go to your user's folder (C:\Users\USERNAME; this folder is defined either by %HOME%, by %USERPROFILE% or by %HOMEDRIVE%%HOMEPATH% env variables) and create a directory named ".config", e.g. 
    • cmd
    • cd C:\Users\USERNAME
    • mkdir .config
          (if you omit this step, you could get access denied error from periscope script)   
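
The same step can be sketched in Python. The lookup order of the three variables below is an assumption (the post only says the folder is defined by one of them), and the function names are made up for this example.

```python
import os


def resolve_home(env):
    """Return the home folder from HOME, USERPROFILE or HOMEDRIVE+HOMEPATH.

    The order in which the variables are tried is an assumption.
    """
    if env.get("HOME"):
        return env["HOME"]
    if env.get("USERPROFILE"):
        return env["USERPROFILE"]
    if env.get("HOMEDRIVE") and env.get("HOMEPATH"):
        return env["HOMEDRIVE"] + env["HOMEPATH"]
    return None


def ensure_config_dir(home):
    """Create <home>/.config if it is missing and return its path."""
    config_dir = os.path.join(home, ".config")
    if not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    return config_dir
```

To use it, call ensure_config_dir(resolve_home(os.environ)) once before the first periscope run.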
  • Vb Script
  1. create .vbs file that will handle subtitle downloading


  • c:\MyScripts\download.vbs
    'In v5, support for the uTorrent kind (%K) was added; kind can be "single" or "multi"
    Const RUN = "C:\Python27\python.exe C:\Python27\scripts\periscope"  
    Const LANGS = "-l el -l en"
    Const LOGFILE_POSTFIX = "_downloader.log"
    Const SUB_EXT = "srt"
    Const EXPECETED=3 ' utorrent pause code
    Const FORAPPENDING = 8
    Const LEAST_NUM_OF_ARG=3 ' (0) = Directory  (1) = File (2) = Kind + optional (3) = Status 
    Const KIND_MULTI = "multi"
    
    
    On Error Resume Next
    Dim directory,file,kind,extension,whatToRun 
    set objFSO = CreateObject("Scripting.FileSystemObject")
    
    Function log(logfile,str)
     set objFile = objFSO.OpenTextFile(logfile, FORAPPENDING, True)
     objFile.WriteLine("Time Stamp: " & Now)
     objFile.WriteLine(" " & str)
     objFile.Close
     set objFile=Nothing
    End Function
    
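    'Need at least directory, file and kind
    if (WScript.Arguments.Count < LEAST_NUM_OF_ARG) then
     WScript.quit()
    end if
    directory = WScript.Arguments(0)
    file = WScript.Arguments(1)
    kind = WScript.Arguments(2)
    'Assumption: the original logfile assignment did not survive in this listing;
    'this one simply keeps the log next to the script
    logfile = WScript.ScriptFullName & LOGFILE_POSTFIX
    
    'When a status argument is present, only continue for the expected (pause) state
    if (WScript.Arguments.Count > LEAST_NUM_OF_ARG) then 
     if (WScript.Arguments.Count=LEAST_NUM_OF_ARG+1 and WScript.Arguments(LEAST_NUM_OF_ARG)<>CStr(EXPECETED)) then  
      'Wscript.Echo "Arg3 unexpected value" &  WScript.Arguments(LEAST_NUM_OF_ARG) &" " & EXPECETED
      log logfile,WScript.Arguments(1) & " Arg3 unexpected value " &  WScript.Arguments(LEAST_NUM_OF_ARG) &" was expecting " & EXPECETED
      WScript.quit()
     end if
    end if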
     
     extension = Right(file,3)
     if kind=KIND_MULTI then
        whatToRun = RUN & " " & LANGS &" " & chr(34) &directory&           chr(34)
     else 
        whatToRun = RUN & " " & LANGS &" " & chr(34) &directory&"\" &file& chr(34) 
     if (extension<>"mp4" and extension<>"avi" and extension<>"mkv") then 
      'Wscript.Echo "unsupported extension:" &  extension
      log logfile,"unsupported extension:" &  extension
      WScript.quit()
     end if 
     end if
     
      log logfile,"Will try " & file 
      Set WshShell = WScript.CreateObject("WScript.Shell") 
      'WshShell.Run "cmd.exe /c start /min  " 
      WshShell.Run "cmd.exe /c  " &_ 
                   " echo %date% %time% " &_
          directory & " " & file & " " & whatToRun & " >> " & logfile & " &" &_
          whatToRun &  " >> "&logfile&"  2>&1"&_
          " & exit",0,True  
       newFile = left(file,len(file)-3)&SUB_EXT
      'Wscript.Echo(newFile)
       'FileExists takes the plain path; wrapping it in quote characters makes the check fail
       If objFSO.FileExists(directory & "\" & newFile) Then
       'Wscript.Echo("Downloaded subtitles for " & file)
        End If 
      
      set objFSO=nothing
      Set WshShell=nothing
    




  • Set  Utorrent - Preferences - Advanced - Run Program :
    Run This program when torrent finishes
    c:\MyScripts\download.vbs "%D" "%F" %K
    Run When torrent  changes state
    c:\MyScripts\download.vbs "%D" "%F" %K %S
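
For readers who find VBScript hard to follow, here is a rough Python sketch of the decision the script makes with the arguments that utorrent passes in (%D, %F, %K and the optional %S). The function name and its parameters are made up for this example; the paths, language codes and state code are the ones used in the vbs. Unlike the vbs, the sketch compares the extension case-insensitively.

```python
VIDEO_EXTENSIONS = ("mp4", "avi", "mkv")
PAUSED = "3"  # uTorrent state code that the post uses to re-trigger a search


def build_command(directory, filename, kind, status=None,
                  runner=r"C:\Python27\python.exe C:\Python27\scripts\periscope",
                  langs=("el", "en")):
    """Return the periscope command line to run, or None when nothing should run."""
    # When a state code (%S) is given, only react to the "paused" state.
    if status is not None and status != PAUSED:
        return None
    lang_flags = " ".join("-l " + code for code in langs)
    if kind == "multi":
        # Multi-file torrent: hand the whole directory to periscope.
        target = directory
    else:
        # Single file: skip anything that is not a known video type.
        if filename[-3:].lower() not in VIDEO_EXTENSIONS:
            return None
        target = directory + "\\" + filename
    return '%s %s "%s"' % (runner, lang_flags, target)
```

The first utorrent setting corresponds to a call without status, the second to a call with status; only status "3" (pause) lets the second one through.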

  •  Of course, you could write your own .bat file instead, as I did at first. In that case I recommend calling it from utorrent with the following command: "cmd.exe /c start /min  C:\MyPythonSubsDowloader.bat "%D" "%F" %K %S ^& exit". The downside of a .bat file is the DOS window that flickers on screen every time a torrent changes state.
     Also, as you can see in the vbs, the script tries to download Greek and English subs [Const LANGS = "-l el -l en"]; you can use your own languages by changing that line.

     Anyway, we have finished our task. Now, whenever a video finishes downloading, or whenever you hit pause on a downloaded video in utorrent, a subtitle search is performed.
    I use the utorrent pause (state code = 3) to trigger a subtitle download at a time of my choosing, for example when no subs were found the first time, right after the torrent finished.
     
    P.S. It seems that a number of periscope (v 0.2.4) plugins currently have problems, but you can download my fixes for Addic7ed and TVsubtitles. If you do, don't forget to un-comment these plugins in periscope\plugins\__init__.py before executing setup.py install.

    P.S2 I upgraded the download vb script to v5 (24/07/2013); it now downloads subs even if you are getting a multi-item torrent (e.g. a complete season). The old version of the script is still available @ my support site.
    From there you can also download my repack of periscope, which includes all the above fixes.

    Sunday, November 25, 2012

    PDFBox Printing Greek Problem

    I am still working with PdfBox and have hit another obstacle: printing a PDF that contains Greek text seems to be problematic. Most of the text prints out fine, but some characters do not print correctly.
    For example, the small Greek letter  π  prints out as "pi", and there is also an issue with the small μ, which prints out differently ... though at least it still looks like μ.

    I filed a bug report yesterday and hope this can be fixed.
    So far there is no solution on the horizon; I will update this blog if one is found.
    ....

    16.11.2013 
    And so it's time for an update (I know, I know... millions were expecting this update!! ;-). This issue seems to have been solved by Andreas Lehmkühler: https://issues.apache.org/jira/browse/PDFBOX-1452 ... however, there is a catch: the solution is in version 2.0 and the current G.A. version is 1.8.2. I suppose one could download and build version 2.0 from the trunk  http://svn.apache.org/repos/asf/pdfbox/trunk/ (use: svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/), but I haven't tried it yet.
    If you do, let me know how it went for you.
    
    

    Thursday, November 15, 2012

    PdfBox LandScape Printing Problem and Solution

    I tried to use Apache's PdfBox library to silently print documents from my applet.
    Printing in "Portrait" was no problem, as that was my printer's default setting. However, PDFs containing Landscape pages wouldn't print properly, because the orientation is always defined by the printer service and not by the document itself.
    The "problem" seems to be in the PDPageable class, specifically in its getPageFormat method.
    So here is my temporary fix (temporary because I hope it will be fixed in a following version).
    I must mention that my solution is based on Roberto Mazzola's findings and proposal
    (see https://issues.apache.org/jira/browse/PDFBOX-985 for details)

    Version I used : PdfBox-1.7.1 

    Solution:
    ...
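    import java.awt.Dimension;
    import java.awt.print.PageFormat;
    import java.awt.print.Paper;
    import java.awt.print.PrinterException;
    import java.awt.print.PrinterJob;
    import java.util.List;

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDPageable;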
    ...
    public static class MyPDPageable extends PDPageable {
        private final PrinterJob myJob;
        private final PDDocument myDocument;

        // The original no-argument constructor (super(null)) is dropped:
        // without a document, getPageFormat has nothing to read from.

        public MyPDPageable(PDDocument document) throws PrinterException {
            // Fall back to a fresh PrinterJob so that myJob is never null
            this(document, PrinterJob.getPrinterJob());
        }

        public MyPDPageable(PDDocument document, PrinterJob job) throws PrinterException {
            super(document, job);
            this.myJob = job;
            this.myDocument = document;
        }
          @Override
          public PageFormat getPageFormat(int i) throws IndexOutOfBoundsException {
            PageFormat format = myJob.defaultPage();
            List<PDPage> allPages = myDocument.getDocumentCatalog().getAllPages();
            PDPage page = allPages.get(i); // can throw IOOBE
            Dimension media = page.findMediaBox().createDimension();
            Dimension crop = page.findCropBox().createDimension();
            // Center the ImageableArea if the crop is smaller than the media
            double diffWidth = 0.0;
            double diffHeight = 0.0;
            if (!media.equals(crop)) {
                diffWidth = (media.getWidth() - crop.getWidth()) / 2.0;
                diffHeight = (media.getHeight() - crop.getHeight()) / 2.0;
            }
            int vOrientation = PageFormat.PORTRAIT;
             if(media.getWidth() > media.getHeight())
                     vOrientation = PageFormat.LANDSCAPE;
           
            format.setOrientation(vOrientation);
            Paper paper = format.getPaper();
            if (vOrientation == PageFormat.LANDSCAPE) {
               paper.setImageableArea(
                       diffHeight, diffWidth, crop.getHeight(), crop.getWidth());
               paper.setSize(media.getHeight(), media.getWidth());
            } else {
               paper.setImageableArea(
                       diffWidth, diffHeight, crop.getWidth(), crop.getHeight());
               paper.setSize(media.getWidth(), media.getHeight());
            }
            format.setPaper(paper);
            return format;
        }    
          }
    And then, when I want to print:
    .... 
    PDDocument doc = PDDocument.load(psStream, true);
    if (printService != null) {
        PrinterJob pj = PrinterJob.getPrinterJob();
        pj.defaultPage();
        pj.setCopies(Integer.parseInt(finalnumberOfCopies));
        pj.setPrintService(printService);
        pj.setPageable(new MyPDPageable(doc, pj));

        pj.print();
    }
    ....
    Note that I don't use doc.silentPrint(pj) but pj.print() ...

    It is a quick and dirty fix and the code could obviously use some improvements, but it works for me...
    I hope it will be useful to somebody else as well.
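
As a cross-check of the arithmetic in getPageFormat above, here is a small Python sketch of the same calculation. It only mirrors the logic of the Java method and does not touch any printing API; the function name and the tuple layout are made up for this example, and the page sizes are given in points.

```python
PORTRAIT, LANDSCAPE = "portrait", "landscape"


def page_format(media, crop):
    """media and crop are (width, height) tuples in points.

    Returns (orientation, paper_size, imageable_area), where imageable_area is
    (x, y, width, height) expressed in the paper's own (portrait) coordinates.
    """
    media_w, media_h = media
    crop_w, crop_h = crop
    # Margins that centre the crop box inside the media box (0.0 when equal).
    diff_w = (media_w - crop_w) / 2.0
    diff_h = (media_h - crop_h) / 2.0
    if media_w > media_h:
        # Landscape: swap the axes, as the Java code does.
        return (LANDSCAPE, (media_h, media_w), (diff_h, diff_w, crop_h, crop_w))
    return (PORTRAIT, (media_w, media_h), (diff_w, diff_h, crop_w, crop_h))
```

For an A4 page stored sideways, page_format((842, 595), (842, 595)) reports landscape with a paper size of (595, 842): the paper itself stays portrait and only the orientation flag changes, which is why the Java code swaps width and height in that branch.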