<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">

      <p class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
          style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
          lang="EN-US">Hi Jean-Nicolas</span>,<br>

      </p>

      <p class="MsoNormal"

        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>

      </p>

      <p class="MsoNormal"

        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Sounds

        like your sequence files contain some unusual characters that
        BLAST chokes on. Use "od -c" to check; there should only be
        normal characters and \n in there.</p>
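      <p class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">If the
        "od -c" output is hard to scan by eye for a file of this size, the
        same check can be scripted. The following is only a minimal Python
        sketch, assuming that "normal characters" means printable ASCII plus
        \n. The function name find_unexpected_bytes and the ALLOWED whitelist
        are made up for illustration and are not part of InParanoid or BLAST.
        Under that assumption, carriage returns (\r) from Windows line
        endings, tabs and non-ASCII bytes are all reported.</p>

```python
import sys

# Bytes treated as "normal" here: printable ASCII (space through tilde) and "\n".
# This whitelist is an assumption for illustration, not a rule taken from BLAST.
ALLOWED = set(range(0x20, 0x7F)) | {0x0A}


def find_unexpected_bytes(data):
    """Return (line_number, column, byte_value) for each byte outside ALLOWED."""
    hits = []
    line_no, col = 1, 0
    for b in data:
        col += 1
        if b not in ALLOWED:
            hits.append((line_no, col, b))
        if b == 0x0A:
            # A newline ends the current line; restart the column counter.
            line_no, col = line_no + 1, 0
    return hits


if __name__ == "__main__":
    # Hypothetical usage: python check_fasta.py my_proteins.fasta
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as handle:
            for line_no, col, b in find_unexpected_bytes(handle.read()):
                print("line %d, column %d: byte 0x%02x" % (line_no, col, b))
```

      <p class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">The script
        prints one line per offending byte; no output means the file passed
        this particular check.</p>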

      <p class="MsoNormal"

        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>

        /Erik<br>

      </p>

      <br>

      On 2014-03-27 13:59, Jean-Nicolas Audet, Mr wrote:<br>

    </div>

    <blockquote

cite="mid:DD6FB956AE24FC45AFDB1385907E819073DDF1@EXMBX2010-3.campus.MCGILL.CA"

      type="cite">


      <style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

@font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:11.0pt;

        font-family:"Calibri","sans-serif";

        mso-fareast-language:EN-US;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

p

        {mso-style-priority:99;

        mso-margin-top-alt:auto;

        margin-right:0cm;

        mso-margin-bottom-alt:auto;

        margin-left:0cm;

        font-size:12.0pt;

        font-family:"Times New Roman","serif";}

p.MsoAcetate, li.MsoAcetate, div.MsoAcetate

        {mso-style-priority:99;

        mso-style-link:"Texte de bulles Car";

        margin:0cm;

        margin-bottom:.0001pt;

        font-size:8.0pt;

        font-family:"Tahoma","sans-serif";

        mso-fareast-language:EN-US;}

span.TextedebullesCar

        {mso-style-name:"Texte de bulles Car";

        mso-style-priority:99;

        mso-style-link:"Texte de bulles";

        font-family:"Tahoma","sans-serif";}

span.EmailStyle20

        {mso-style-type:personal;

        font-family:"Calibri","sans-serif";

        color:windowtext;}

span.EmailStyle21

        {mso-style-type:personal-reply;

        font-family:"Calibri","sans-serif";

        color:windowtext;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:612.0pt 792.0pt;

        margin:72.0pt 90.0pt 72.0pt 90.0pt;}

div.WordSection1

        {page:WordSection1;}

--></style>

      <div class="WordSection1">

        <p><span lang="EN-US">Hello,</span></p>

        <p><span lang="EN-US">I’ve been trying to get InParanoid to work
            with de novo transcriptomes of two non-model species
            (birds) that I assembled with Trinity. They were 'translated'
            to protein sequences using TransDecoder. The resulting FASTA
            file I'm trying to use in InParanoid (~30,000 sequences)
            looks like this:</span></p>

        <pre>&gt;comp100291_c0_seq1
LPKKILLPIQQVLGHLLLALSYRGKVMQVKALKSKHEHNGPETLDAFLSSKLVVVKQPRE
QAGFPLSIVFIPGEGRQERFLLHGEYNQSFCKEPVMELPRQ
&gt;comp102162_c0_seq1
PNMTLHFLKSSPGSWRLSGLVLIPYVTETISGSCETLTRLQMPAHIQQSRWKAKHGPRIL
LLGLLQNLRSLFPLKVLPPGANSQLKRNCSFTSVCLIGTFYVESS
&gt;comp102206_c0_seq1
CQEQKWQKGNREEKGWAGVTVWGAYFPYLLIRCPNHQTSTPLSIHSQQHFMLCIIICPFS
WLKPPVKTTQMFKGFFFKSGLKKFLALFLISWAAFATDRPLLGKQQSR</pre>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">I tried the example FASTA files supplied with
            the program (called SC and EC) and they work, but with my
            own files it gets stuck at the first step and creates no
            files (and no additional disk usage), even after days. Here
            is what I get with my FASTA files:</span></p>

        <pre>Loading module bio/ncbi-blast-2.2.22.
Formatting BLAST databases
Done formatting
Starting BLAST searches...

Starting first BLAST pass for bf - bf on [blastall] WARNING: the -C 3 argument is currently experimental</pre>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">It then stays like this indefinitely (I waited
            up to 6 days with 24 CPUs and 256 GB of memory).</span></p>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">I also tried supplying BLAST results
            (inter-sample) that I generated myself and parsed with the
            supplied parser, but it still hangs indefinitely in the
            following state, again without generating any files:</span></p>

        <pre>Done BLAST searches. Starting ortholog detection...</pre>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">I tried with and without bootstrapping, with
            and without multithreading (the -a16 option) and, as
            mentioned, with and without supplied BLAST results. I also
            cleaned my FASTA files of any unusual characters (removed
            annotations, all ' * ', spaces, empty lines and dots). Now
            I'm running out of ideas... I'm using a Unix cluster, and I
            ran these jobs with 4 to 24 CPUs and 8 to 256 GB of
            memory.</span></p>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">Finally, since it worked with SC and EC, I
            tried a small subset of my transcriptomes (a few thousand
            sequences) and it worked. It therefore seems to me that the
            problem could be that InParanoid cannot handle more than,
            say, 10,000 sequences. I could split my transcriptomes into
            several smaller files covering specified sequence ranges,
            but then orthologs that do not have exactly the same length
            could be missed if they end up in two different split
            files.</span></p>

        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">Thanks in advance for your help,</span></p>
        <p class="MsoNormal"
          style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
            style="font-size:12.0pt;font-family:'Times New Roman','serif';mso-fareast-language:FR-CA"
            lang="EN-US">Jean-Nicolas</span></p>

      </div>

      <br>


      <br>

      <pre wrap="">_______________________________________________

InParanoid mailing list

<a class="moz-txt-link-abbreviated" href="mailto:InParanoid@lists.su.se">InParanoid@lists.su.se</a>

<a class="moz-txt-link-freetext" href="https://lists.su.se/mailman/listinfo/inparanoid-at-sbc.su.se">https://lists.su.se/mailman/listinfo/inparanoid-at-sbc.su.se</a>

</pre>

    </blockquote>

    <br>

  </body>

</html>