<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">Hi Jean-Nicolas</span><span lang="EN-US"><o:p></o:p></span>,<br>
</p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>
</p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">It sounds
like your sequence files contain some unexpected characters that
BLAST chokes on. Use "od -c" to check: there should be nothing in
there but normal characters and \n.</p>
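<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">A minimal
sketch of the same check in Python, in case the "od -c" output is
hard to scan for a file of this size. It assumes that sequence lines
should contain only uppercase letters and that header lines should
contain only printable ASCII; the exact alphabet BLAST accepts is not
stated here, and the names find_suspect_bytes and report are made up
for illustration.</p>
<pre>
```python
from collections import Counter

# Assumption: sequence lines should hold only uppercase letters.
# The exact alphabet BLAST accepts is not stated in this thread.
ALLOWED_RESIDUES = frozenset(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def find_suspect_bytes(data):
    """Count bytes that are not residues, header text, or newlines."""
    suspects = Counter()
    for line in data.split(b"\n"):
        if line.startswith(b">"):
            # Header line: accept printable ASCII (codes 32 to 126) only.
            bad = [b for b in line if b not in range(32, 127)]
        else:
            # Sequence line: accept uppercase letters only.
            bad = [b for b in line if b not in ALLOWED_RESIDUES]
        suspects.update(bad)
    return suspects


def report(path):
    """Print each suspect byte of the file at `path` with its count."""
    with open(path, "rb") as handle:
        suspects = find_suspect_bytes(handle.read())
    for byte, count in sorted(suspects.items()):
        print(f"{bytes([byte])!r}: {count}")
```
</pre>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Calling
report on each fasta file lists every unexpected byte with its
count. A \r on every line, for instance, would point to Windows line
endings.</p>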
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>
/Erik<br>
</p>
<br>
On 2014-03-27 13:59, Jean-Nicolas Audet, Mr wrote:<br>
</div>
<blockquote
cite="mid:DD6FB956AE24FC45AFDB1385907E819073DDF1@EXMBX2010-3.campus.MCGILL.CA"
type="cite">
<div class="WordSection1">
<p><span lang="EN-US">Hello,<o:p></o:p></span></p>
<p><span lang="EN-US">I’ve been trying to get InParanoid to work
with de novo transcriptomes of two non-model species
(birds) that I assembled with Trinity. They were 'translated' to
protein sequences using TransDecoder. The resulting fasta
file I'm trying to use in InParanoid looks like this (about
30,000 sequences):<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">>comp100291_c0_seq1<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">LPKKILLPIQQVLGHLLLALSYRGKVMQVKALKSKHEHNGPETLDAFLSSKLVVVKQPRE<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">QAGFPLSIVFIPGEGRQERFLLHGEYNQSFCKEPVMELPRQ<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">>comp102162_c0_seq1<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">PNMTLHFLKSSPGSWRLSGLVLIPYVTETISGSCETLTRLQMPAHIQQSRWKAKHGPRIL<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">LLGLLQNLRSLFPLKVLPPGANSQLKRNCSFTSVCLIGTFYVESS<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">>comp102206_c0_seq1<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">CQEQKWQKGNREEKGWAGVTVWGAYFPYLLIRCPNHQTSTPLSIHSQQHFMLCIIICPFS<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">WLKPPVKTTQMFKGFFFKSGLKKFLALFLISWAAFATDRPLLGKQQSR<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">I tried the example fasta files supplied with
the program (called SC and EC) and they work, but with my own
files it gets stuck at the first step and creates no files
(and shows no disk usage), even after days. Here is what I
get with my fasta files:<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Loading
module bio/ncbi-blast-2.2.22.<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Formatting
BLAST databases<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Done
formatting<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Starting
BLAST searches...<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Starting
first BLAST pass for bf - bf on [blastall] WARNING: the -C 3
argument is currently experimental<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">It then stays like this forever (I tried up to
6 days with 24 CPUs and 256G).<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">I also tried supplying inter-sample BLAST results
that I generated myself and parsed with the supplied parser,
but it then stays in the following state forever, again
without generating any file:<o:p></o:p></span></p>
<p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Courier
New";mso-fareast-language:FR-CA" lang="EN-US">Done
BLAST searches. Starting ortholog detection...<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">I tried with and without bootstrapping, with and
without multithreading (the -a16 option) and, as I said, with
and without supplied BLAST results. I also cleaned my fasta
files of any weird characters (removed annotations, all ' *
', spaces, empty lines and dots). Now I'm running out of
ideas... I'm using a Unix cluster. I ran these jobs with
4 to 24 CPUs and 8 to 256G of memory.<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">Finally, since it was working with SC and EC, I
tried a small subset of my transcriptomes (a few
thousand sequences) and it worked. So it seems to me that
the problem could be that InParanoid cannot take more than,
say, 10,000 sequences. I could split my transcriptomes into
several smaller files by sequence-length range, but then
orthologs that do not have exactly the same length could
be missed if they end up in two different
split files.<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">Thanks in advance for your help,<o:p></o:p></span></p>
<p class="MsoNormal"
style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span
style="font-size:12.0pt;font-family:"Times New
Roman","serif";mso-fareast-language:FR-CA"
lang="EN-US">Jean-Nicolas</span><span lang="EN-US"><o:p></o:p></span></p>
</div>
<br>
<br>
<pre wrap="">_______________________________________________
InParanoid mailing list
<a class="moz-txt-link-abbreviated" href="mailto:InParanoid@lists.su.se">InParanoid@lists.su.se</a>
<a class="moz-txt-link-freetext" href="https://lists.su.se/mailman/listinfo/inparanoid-at-sbc.su.se">https://lists.su.se/mailman/listinfo/inparanoid-at-sbc.su.se</a>
</pre>
</blockquote>
<br>
</body>
</html>