[Primeusers] Release 1.0 of PrimeGSR

Thu Feb 12 20:08:10 CET 2009

Scaling the tree to 1.0 at the root apparently works (still running) with both 90 ("-c 200") and 375 
("-c 400") sequences, although the run with 375 sequences needs too much memory (around 11 GB).

Changing the "-c" option greatly influences memory usage and speed. I've also tried a scaled tree 
with branch lengths roughly half those of the 1.0 root tree, and it only works with "-c" 
equal to at least 500. Is it normal that halving the branch lengths leads to such an 
increase in memory usage and run time (from 700 MB to 4 GB, and approx. 10x slower)?
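
For anyone who wants to reproduce the rescaling, here is a minimal sketch of how a 
species tree could be scaled so that the root sits at height 1.0. It is not part of 
PrimeGSR. It assumes a Newick string with unquoted names, plain numeric branch 
lengths and no comments, and the function names tree_height and 
rescale_to_unit_height are hypothetical.

```python
import re

# Structural Newick characters, or a run of anything else (label and/or ":length").
_TOKEN = re.compile(r"[(),;]|[^(),;]+")
# A branch length such as ":30", ":0.25" or ":1e-3".
_LENGTH = re.compile(r":(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")
_STRUCTURAL = {"(", ")", ",", ";"}


def _branch_length(label):
    """Return the branch length in a 'name:length' token, or 0.0 if absent."""
    match = _LENGTH.search(label)
    return float(match.group(1)) if match else 0.0


def tree_height(newick):
    """Return the largest root-to-leaf distance of a Newick tree."""
    tokens = _TOKEN.findall(re.sub(r"\s+", "", newick))
    stack = [[]]  # one list of child heights per open node; stack[0] holds the root
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append([])
        elif tok == ")":
            height = max(stack.pop())
            # An internal node may be followed by its own label and branch length.
            if i + 1 < len(tokens) and tokens[i + 1] not in _STRUCTURAL:
                i += 1
                height += _branch_length(tokens[i])
            stack[-1].append(height)
        elif tok not in _STRUCTURAL:
            stack[-1].append(_branch_length(tok))  # a leaf
        i += 1
    return stack[0][0]


def rescale_to_unit_height(newick):
    """Divide every branch length by the tree height so the root is at 1.0."""
    height = tree_height(newick)
    if height <= 0:
        raise ValueError("tree has no positive branch lengths")
    return _LENGTH.sub(
        lambda m: ":" + format(float(m.group(1)) / height, ".6g"), newick
    )


if __name__ == "__main__":
    # Toy tree with branch lengths in millions of years.
    print(rescale_to_unit_height("((A:30,B:30):70,C:100);"))
```

On the toy tree this prints ((A:0.3,B:0.3):0.7,C:1); and the same function with a 
different divisor would give the "half-length" tree mentioned above.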

FG

Lars Arvestad wrote:
> What scale did you try? If you haven't already, try scaling the tree so 
> the root is at 1.0.
> 
>     Lars
> 
> 
> Filipe Garrett wrote:
>> I've just noticed that I've been using a larger species tree (with 
>> some species that have no genes in the MSA file) but I don't think 
>> that was a problem since the error persisted after correcting the tree.
>> I'm sending you the tree I've used now with 11 species; the distances 
>> are in millions of years.
>> I later thought that the large tree branches (in millions of years) 
>> could be a problem, but reducing the branch lengths didn't solve the errors.
>>
>> FG
>>
>>
>>
>> Lars Arvestad wrote:
>>> Would you mind sending me the species tree? It would be nice to try 
>>> to understand it a bit better. I don't think 90 sequences should be a 
>>> problem, but we have not tried the program with 375 sequences. If the 
>>> branches in the species tree are very short, then our discretization 
>>> approach might not work very well.
>>>
>>>     Lars
>>>
>>>
>>>
>>> Filipe Garrett wrote:
>>>> Dear Lars,
>>>>
>>>> I've tried the "-c 50" option but still the same error.
>>>> I tried other values of "-c" ranging from 50 to 1000 and all values 
>>>> up to "-c 800" had the same error:
>>>> "
>>>> Error:
>>>>      Probability: Division with zero attempted!
>>>> "
>>>>
>>>> The only case where the program seemed to be working was for "-c 
>>>> 850" but it used way too much memory: around 11Gb!!!
>>>>
>>>> I suppose that's why when I used "-c 900" and "-c 1000" a different 
>>>> error appeared, probably due to memory limitations:
>>>> "
>>>> ...
>>>> # start init fastGEM_BirthDeathProbs
>>>> # end init fastGEM_BirthDeathProbs
>>>>   Exception
>>>> St9bad_alloc
>>>> "
>>>>
>>>> At this point I think I may have a case of too many 
>>>> sequences/species or low memory. I'm running primeGSR on a PC Linux 
>>>> machine with 12Gb of RAM and I have 90 sequences from 21 species 
>>>> (also have another dataset with 375 sequences from 21 species).
>>>> What do you think?
>>>>
>>>> thanks a lot for your time,
>>>> FG
>>>>
>>>>
>>>> Lars Arvestad wrote:
>>>>  
>>>>> Hi Filipe,
>>>>> Please try the option "-c 50", and if the problem persists, double 
>>>>> the number. This option decides how many discretization points we 
>>>>> will use in the speciation tree. If there are too few points, there 
>>>>> will be edges in the species tree where we can't put down a gene 
>>>>> tree node, even if we need to. Unfortunately, we do not detect and 
>>>>> fix this properly yet.
>>>>>
>>>>>     Best regards,
>>>>>     Lars
>>>>>
>>>>>
>>>>> Filipe Garrett wrote:
>>>>>    
>>>>>> Dear Lars,
>>>>>>
>>>>>> Despite not knowing what caused it, I've been able to fix the 
>>>>>> previous error ("not a valid alphabet state") by remaking the 
>>>>>> input files (probably some typo).
>>>>>> However a new error occurs:
>>>>>> "
>>>>>> Error:
>>>>>>      Probability: Division with zero attempted!
>>>>>> "
>>>>>>
>>>>>> Any idea what it may be?
>>>>>> thanks in adv,
>>>>>> FG
>>>>>>
>>>>>>
>>>>>> Lars Arvestad wrote:
>>>>>>        
>>>>>>>> I've just installed primeGSR and had some problems. It couldn't 
>>>>>>>> find the "g2f" library. I removed it from the Makefile and it 
>>>>>>>> compiled without errors. I ran the sample files and everything 
>>>>>>>> seemed fine (at least I think so, since I don't know if the output 
>>>>>>>> is correct). What is the function of this library? Is it normal to 
>>>>>>>> compile without it?
>>>>>>>>                   
>>>>>>> Do you mean "g2c"? This used to be needed when linking the linear 
>>>>>>> algebra libraries BLAS and LAPACK. Maybe this need has 
>>>>>>> disappeared in modern systems. I will remove it from the makefile!
>>>>>>>
>>>>>>>            
>>>>>>>> Then I was trying primeGSR and managed to input everything 
>>>>>>>> correctly (I think). However an error occurred stating:
>>>>>>>>
>>>>>>>> "
>>>>>>>> the state  is notError:
>>>>>>>>      not a valid alphabet state
>>>>>>>> "
>>>>>>>>
>>>>>>>> I'm using this command line on a PC Fedora Core 7 Linux:
>>>>>>>> primeGSR -d -o output.mcmc -Bp 0.1 0.1 -Ed Gamma -i 1000 -t 10 
>>>>>>>> -Sm JTT -Hi tree.nwk in_seq.fas in_seq.gs
>>>>>>>>                 
>>>>>>> This bad error message is due to finding an 
>>>>>>> unrecognized/unsupported character in the sequence data. If you 
>>>>>>> have trouble finding the offending sequence, send it to me and I 
>>>>>>> will take a look at it.
>>>>>>>
>>>>>>> Thanks for the bug reports!
>>>>>>> Lasse
>>>>>>>
>>>>>>>             
>>>>>>         
>>>>> _______________________________________________
>>>>> Primeusers mailing list
>>>>> Primeusers at sbc.su.se
>>>>> https://mail.sbc.su.se/mailman/listinfo/primeusers
>>>>>
>>>>>     
>>>>   
>>>
> 
> 

-- 
Filipe G. Vieira
Departament de Genetica
Universitat de Barcelona
Av. Diagonal, 645
08028 Barcelona
SPAIN
Phone: +34 934 035 306
Fax: +34 934 034 420
fgarret at ub.edu
http://www.ub.edu/molevol/