[Primeusers] Release 1.0 of PrimeGSR
Filipe Garrett
fgarret at ub.edu
Thu Feb 12 20:08:10 CET 2009
Scaling the tree so that the root is at 1.0 apparently works (still running) with both the
90-sequence ("-c 200") and the 375-sequence ("-c 400") datasets, although the 375-sequence run
needs too much memory (around 11 GB).
Changing the "-c" option greatly influences memory usage and speed. I've also tried a scaled tree
whose branch lengths are roughly half those of the root-at-1.0 tree, and it only works with "-c"
of at least 500. Is it normal that halving the branch lengths leads to such an
increase in memory usage and time (from 700 MB to 4 GB, and approx. 10x slower)?
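
For anyone wanting to reproduce the rescaling, here is a minimal sketch of one way to do it. It is not part of PrimeGSR. The function names (scale_newick, root_height) and the toy tree are hypothetical, and the code assumes a plain Newick string with unquoted labels and a numeric length after every colon.

```python
import re

# Matches ":<number>" branch lengths; assumes labels are unquoted and contain no colons.
_LENGTH = re.compile(r":([0-9.eE+-]+)")
_LABEL = re.compile(r"[^:,()]*")


def scale_newick(newick: str, factor: float) -> str:
    """Return the Newick string with every branch length multiplied by factor."""
    return _LENGTH.sub(lambda m: f":{float(m.group(1)) * factor:.10g}", newick)


def root_height(newick: str) -> float:
    """Return the largest root-to-leaf distance in a Newick tree.

    A branch length attached to the root itself, if present, is included.
    """
    text = newick.strip().rstrip(";")
    pos = 0

    def parse_node() -> float:
        nonlocal pos
        height = 0.0
        if text[pos] == "(":
            pos += 1  # consume "("
            child_heights = [parse_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                child_heights.append(parse_node())
            pos += 1  # consume ")"
            height = max(child_heights)
        pos = _LABEL.match(text, pos).end()  # skip an optional node label
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            m = _LENGTH.match(text, pos)
            length = float(m.group(1))
            pos = m.end()
        return height + length

    return parse_node()


# Toy tree with branch lengths in millions of years (hypothetical values).
tree = "((A:10,B:10):20,C:30);"
scaled = scale_newick(tree, 1.0 / root_height(tree))  # root now at 1.0
halved = scale_newick(scaled, 0.5)                    # the "half-length" variant
print(scaled)
print(halved)
```

Dividing every length by the current root height keeps the relative branch lengths intact, so only the time unit changes.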
FG
Lars Arvestad wrote:
> What scale did you try? If you haven't already, try scaling the tree so
> the root is at 1.0.
>
> Lars
>
>
> Filipe Garrett wrote:
>> I've just noticed that I've been using a larger species tree (with
>> some species that have no genes in the MSA file) but I don't think
>> that was a problem since the error persisted after correcting the tree.
>> I'm sending you the tree I've used now with 11 species; the distances
>> are in millions of years.
>> I later thought that the large tree branches (in millions of years)
>> could be a problem but reducing branch length didn't solve the errors.
>>
>> FG
>>
>>
>>
>> Lars Arvestad wrote:
>>> Would you mind sending me the species tree? It would be nice to try
>>> to understand it a bit better. I don't think 90 sequences should be a
>>> problem, but we have not tried the program with 375 sequences. If the
>>> branches in the species tree are very short, then our discretization
>>> approach might not work very well.
>>>
>>> Lars
>>>
>>>
>>>
>>> Filipe Garrett wrote:
>>>> Dear Lars,
>>>>
>>>> I've tried the "-c 50" option but still the same error.
>>>> I tried other values of "-c" ranging from 50 to 1000 and all values
>>>> up to "-c 800" had the same error:
>>>> "
>>>> Error:
>>>> Probability: Division with zero attempted!
>>>> "
>>>>
>>>> The only case where the program seemed to be working was for "-c
>>>> 850" but it used way too much memory: around 11Gb!!!
>>>>
>>>> I suppose that's why when I used "-c 900" and "-c 1000" a different
>>>> error appeared, probably due to memory limitations:
>>>> "
>>>> ...
>>>> # start init fastGEM_BirthDeathProbs
>>>> # end init fastGEM_BirthDeathProbs
>>>> Exception
>>>> St9bad_alloc
>>>> "
>>>>
>>>> At this point I think I may have a case of too many
>>>> sequences/species or low memory. I'm running primeGSR on a PC Linux
>>>> machine with 12Gb of RAM and I have 90 sequences from 21 species
>>>> (also have another dataset with 375 sequences from 21 species).
>>>> What do you think?
>>>>
>>>> thanks a lot for your time,
>>>> FG
>>>>
>>>>
>>>> Lars Arvestad wrote:
>>>>
>>>>> Hi Filipe,
>>>>> Please try the option "-c 50", and if the problem persists, double
>>>>> the number. This option decides how many discretization points we
>>>>> will use in the speciation tree. If there are too few points, there
>>>>> will be edges in the species tree where we can't put down a gene
>>>>> tree node, even if we need to. Unfortunately, we do not detect and
>>>>> fix this properly yet.
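
To make the point about too few discretization points concrete, here is a toy model. It is only an illustration and not PrimeGSR's actual discretization: it assumes the points form a uniform grid between the leaves (time 0) and the root, and that an edge is usable only if at least one grid point lies strictly inside it. The function name and the edge times are hypothetical.

```python
import math


def edges_without_points(edges, root_height, n_points):
    """Return the edges (lower_time, upper_time) with no grid point strictly inside.

    Assumption: grid points sit at k * (root_height / n_points) for k = 1, 2, ...
    """
    step = root_height / n_points
    empty = []
    for lower, upper in edges:
        # First grid point strictly above the lower end of the edge.
        first_above = (math.floor(lower / step) + 1) * step
        if first_above >= upper:
            empty.append((lower, upper))
    return empty


# A root-at-1.0 tree path with one very short edge (hypothetical times).
edges = [(0.0, 0.303), (0.303, 0.313), (0.313, 1.0)]
print(edges_without_points(edges, 1.0, 50))   # coarse grid
print(edges_without_points(edges, 1.0, 100))  # doubled grid
```

Under this model the short edge falls between two grid points on the coarse grid and picks up a point once the grid is doubled, which is the motivation for doubling "-c" until the error disappears.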
>>>>>
>>>>> Best regards,
>>>>> Lars
>>>>>
>>>>>
>>>>> Filipe Garrett wrote:
>>>>>
>>>>>> Dear Lars,
>>>>>>
>>>>>> Despite not knowing what it was, I've been able to fix the
>>>>>> previous error ("not a valid alphabet state") by remaking the
>>>>>> input files (probably some typo).
>>>>>> However a new error occurs:
>>>>>> "
>>>>>> Error:
>>>>>> Probability: Division with zero attempted!
>>>>>> "
>>>>>>
>>>>>> Any idea what it may be?
>>>>>> thanks in adv,
>>>>>> FG
>>>>>>
>>>>>>
>>>>>> Lars Arvestad wrote:
>>>>>>
>>>>>>>> I've just installed primeGSR and had some problems. It couldn't
>>>>>>>> find the "g2f" library. I removed it from the Makefile and it
>>>>>>>> compiled without errors. I ran the sample files and everything
>>>>>>>> was perfect (at least I think so, since I don't know if the output is
>>>>>>>> correct). What is the function of this library? Is it normal to
>>>>>>>> compile without it?
>>>>>>>>
>>>>>>> Do you mean "g2c"? This used to be needed when linking the linear
>>>>>>> algebra libraries BLAS and LAPACK. Maybe this need has
>>>>>>> disappeared in modern systems. I will remove it from the makefile!
>>>>>>>
>>>>>>>
>>>>>>>> Then I was trying primeGSR and managed to input everything
>>>>>>>> correctly (I think). However an error occurred stating:
>>>>>>>>
>>>>>>>> "
>>>>>>>> the state is notError:
>>>>>>>> not a valid alphabet state
>>>>>>>> "
>>>>>>>>
>>>>>>>> I'm using this command line on a PC Fedora Core 7 Linux:
>>>>>>>> primeGSR -d -o output.mcmc -Bp 0.1 0.1 -Ed Gamma -i 1000 -t 10
>>>>>>>> -Sm JTT -Hi tree.nwk in_seq.fas in_seq.gs
>>>>>>>>
>>>>>>> This bad error message is due to finding an
>>>>>>> unrecognized/unsupported character in the sequence data. If you
>>>>>>> have trouble finding the offending sequence, send it to me and I
>>>>>>> will take a look at it.
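
One way to track down the offending sequence is to scan the FASTA input for characters outside the expected alphabet. The sketch below is not part of PrimeGSR. It assumes protein data (the command line above uses "-Sm JTT") and takes the 20 standard amino-acid letters plus the gap character "-" as valid. Which characters PrimeGSR actually accepts is an assumption here, and the function name is hypothetical.

```python
# Assumed alphabet: 20 standard amino acids plus the alignment gap character.
VALID = set("ACDEFGHIKLMNPQRSTVWY-")


def find_invalid_residues(fasta_text, valid=VALID):
    """Return (sequence_name, 1-based position, character) for each unexpected character."""
    problems = []
    name = None
    offset = 0  # residues already seen in the current sequence
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            offset = 0
            continue
        for i, ch in enumerate(line):
            if ch.upper() not in valid:
                problems.append((name, offset + i + 1, ch))
        offset += len(line)
    return problems


# Small in-memory example; for a real file, pass open("in_seq.fas").read().
example = ">seq1\nACDE\nFGXI\n>seq2\nMK*L\n"
for name, position, char in find_invalid_residues(example):
    print(name, position, repr(char))
```

Ambiguity codes such as "X" or "B" and stop symbols such as "*" are flagged under this alphabet; adjust VALID if the program turns out to accept them.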
>>>>>>>
>>>>>>> Thanks for the bug reports!
>>>>>>> Lasse
>>>>>>>
>>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Primeusers mailing list
>>>>> Primeusers at sbc.su.se
>>>>> https://mail.sbc.su.se/mailman/listinfo/primeusers
>>>>>
>>>>>
>>>>
>>>
>
>
--
Filipe G. Vieira
Departament de Genetica
Universitat de Barcelona
Av. Diagonal, 645
08028 Barcelona
SPAIN
Phone: +34 934 035 306
Fax: +34 934 034 420
fgarret at ub.edu
http://www.ub.edu/molevol/