[Primeusers] PrimeGSR memory usage

Tue Mar 17 09:29:16 CET 2009

Sorry, I missed some words: I meant that it could be good to try a
problem size in between the two datasets you have now. I would take a
look at the species tree to see where the short branches are and try to
remove those species that "cause" the short branches in the species tree.

    Lasse
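
To make the "look for short branches" suggestion concrete, here is a
minimal Python sketch that lists the shortest edges in a Newick species
tree. Everything in it is illustrative: the toy tree, the function names
and the regular expression are made up for this example and are not part
of PrimeGSR. The point estimate assumes, purely for illustration, that
the "-c" points are spread evenly over root-to-leaf time; the thread does
not say how PrimeGSR actually places them.

```python
import re

# Matches "label:length" pairs in a Newick string.
# The label is empty for unnamed internal nodes.
EDGE_RE = re.compile(
    r"([^\s(),:;]*):([0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)"
)


def edge_lengths(newick):
    """Return a (label, length) pair for every edge that has a length."""
    return [
        (label or "<internal>", float(length))
        for label, length in EDGE_RE.findall(newick)
    ]


def shortest_edges(newick, n=3):
    """Return the n shortest edges, shortest first."""
    return sorted(edge_lengths(newick), key=lambda edge: edge[1])[:n]


def approx_points_on_edge(length, c, root_time=1.0):
    """Rough number of discretization points landing on one edge.

    ASSUMPTION (not taken from PrimeGSR): c points are spread evenly
    over the time from root to leaves, so an edge gets a share
    proportional to its length.
    """
    return c * length / root_time


# Hypothetical species tree, scaled so that the root is at time 1.0.
toy_tree = "((A:0.02,B:0.02):0.98,(C:0.6,D:0.6):0.4);"

for label, length in shortest_edges(toy_tree):
    points = approx_points_on_edge(length, c=90)
    print(f"{label}\t{length}\t~{points:.1f} points")
```

Under that assumption, an edge covering 2% of the tree height receives
only about 2% of the points, which is one way to see why short species
edges push the required "-c" value up.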

Filipe Garrett wrote:
> The smaller dataset worked fine but I had to scale the tree root to 1.0 and use the Uniform 
> distribution for the edge rates. Otherwise, I had to increase the "-c" option too much and it would 
> take too long...
>
> What do you mean by "try something immediate the two problem sizes"?
>
> best regards,
> FG
>
>
> Lars Arvestad wrote:
>   
>> The alignment does not play in here. The -c option decides how many 
>> discretization points you put in the species tree, and these points are 
>> where gene duplications in the reconciliation are allowed to be placed. 
>> With large gene trees, there is a larger need for discretization points 
>> in the species tree. Also, if there are edges in the species tree that 
>> are short, then you have to ask for more discretization points in order 
>> to get enough points "placed" on the short species edges. This latter 
>> issue is something we have a student working on (in addition to other 
>> improvements and general speedups).
>>
>> My recommendation would be to try something immediate the two problem 
>> sizes you have now.
>>
>> How has the smaller dataset worked out for you?
>>
>>     Best,
>>     Lars
>>
>>
>> Filipe Garrett wrote:
>>     
>>> Dear Lars,
>>>
>>> As I had already told you, I am working with two datasets: one with 90 and the other with 375 sequences.
>>> With the smaller one I have no problem when I use a tree scaled to a root of 1.0 but, with the
>>> other, even with the scaled tree, I have to increase the "-c" option a lot.
>>> While the first dataset works with "-c 90", the second needs at least "-c 650"! I've only tried
>>> up to "-c 650" since, at this level, primeGSR already uses over 33 GB of memory and still outputs
>>> the error message:
>>>
>>> "
>>> Error:
>>>        Probability: Division with zero attempted!
>>> "
>>>
>>>
>>> Is this normal and is there a way to reduce memory usage?
>>> Does the quality of the alignment have any influence on running time and memory usage? Does the
>>> alignment quality have a great influence on the birth and death rate estimates? I'm asking this
>>> because I'm working with a rather divergent gene family and the alignment is likely to have a lot of
>>> gaps (automatic alignment with no manual revision). I'm mainly interested in the birth and death
>>> rate estimates...
>>>
>>> thanks for your time,
>>> FG
>>>