[Primeusers] Problem running primeGSR

Lars Arvestad arve at csc.kth.se
Tue Jun 15 16:17:42 CEST 2010


OK, the PrIME-GSR download page,
	http://prime.sbc.su.se/primeGSR/download.html
now has a section linking to a preview of a faster version, tentatively called PrimeGSRf.

I have had to experiment with the values of some new discretization options to get a larger test case working:

  -Dt <float>
      Approximate discretization timestep. Set to 0 to divide every edge in equally
      many parts (see -Di). Defaults to 0.05. See -Dtt for edge above root.
  -Di <unsigned int>
      Minimum number of parts to slice each edge in. If -Dt is set to 0, this becomes
      the exact number of parts. Minimum 2. Defaults to 3. See -Dtt for edge above root.
  -Dtt <unsigned int>
      Override number of discretization points for edge above root in host tree.
      By default, irrespective of time span, this is set to the number
      of points for a (hypothetical) root-to-leaf edge.
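
As a rough illustration of how -Dt and -Di interact according to the help text above, here is a minimal Python sketch. It assumes the number of parts on an edge is the edge's time span divided by the timestep, rounded up, and never below the -Di minimum. The function name, its parameter names, and the rounding rule are assumptions made for illustration; they are not taken from the PrimeGSRf source, and the -Dtt override for the edge above the root is not modelled.

```python
import math


def parts_per_edge(edge_time, dt=0.05, min_parts=3):
    """Hypothetical reading of the -Dt / -Di help text (not PrimeGSRf code).

    edge_time: time span of one host-tree edge
    dt:        approximate discretization timestep (-Dt), 0 disables it
    min_parts: minimum number of parts per edge (-Di), at least 2
    """
    if min_parts < 2:
        raise ValueError("-Di must be at least 2")
    if dt < 0:
        raise ValueError("-Dt must not be negative")
    if dt == 0:
        # With -Dt 0 every edge gets exactly -Di parts.
        return min_parts
    # Assumed rule: slice by timestep, but never below the minimum.
    return max(min_parts, math.ceil(edge_time / dt))


if __name__ == "__main__":
    for edge_time in (0.01, 0.3, 1.0):
        print(edge_time,
              parts_per_edge(edge_time),
              parts_per_edge(edge_time, dt=0))
```

Under this reading, a very short edge still receives the -Di minimum, so raising -Di or lowering -Dt is what adds points on short edges, at the cost of running time.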

Let us know if you have problems!

	Lars

--
Computational Biology at School of Computer Science and Communication
and Stockholm Bioinformatics Center
http://www.csc.kth.se/~arve





On Jun 14, 2010, at 4:00 PM, Albert Vilella wrote:

> That would be great, thanks Lars and Joel. We've got many cases where
> the branch length is very short. Each Ensembl production produces
> ~20,000 trees, of which ~10,000 have a good number of
> sequences with a variety of branch lengths.
> 
> Any method that would solve this problem for this dataset would be
> great to try. Apart from the example I pointed to below, if you want
> a bunch of other examples you can use our data dumps:
> 
> ftp://ftp.ensembl.org/pub/current_emf/ensembl-compara/homologies/Compara.gene_trees.58.cds.fasta.gz
> 
> But I'll also try it myself once I get hold of the code.
> 
> On Mon, Jun 14, 2010 at 1:17 PM, Lars Arvestad <arve at csc.kth.se> wrote:
>> 
>> This problem occurs because of the discretization of the species tree that our method needs. Two issues seem to appear. Either a branch in the species tree is very short and we don't get enough (or any!) discretization points on the edge, making it impossible to place a putative duplication there, or the input sequences are so many that the discretization you have asked for is not enough on some parts of the species tree.
>> 
>> The solution is to ask for more discretization points, but that incurs a bad slowdown. Our student Joel Sjöstrand has worked on better handling of discretization and on algorithmic improvements to avoid massive slowdowns, but we are not done testing them. In the tests I have run, one still has to try out some different parameter settings on "tough" data.
>> 
>> We can offer a binary to those of you who are interested in trying this faster version of Prime-GSR on Linux.
>> 
>>        Best regards,
>>        Lars
>> 
>> --
>> Computational Biology at School of Computer Science and Communication
>> and Stockholm Bioinformatics Center
>> http://www.csc.kth.se/~arve
>> 
>> 
>> 
>> 
>> 
>> On Jun 10, 2010, at 2:49 PM, Albert Vilella wrote:
>> 
>>> I am also having problems with "Division by zero" when running primeGSR:
>>> Here is an example dataset that doesn't work for me:
>>> http://www.ebi.ac.uk/~avilella/primegsr_test/
>>> 
>>> e.g.
>>> 
>>> node_id=88969.pg
>>> /nfs/users/nfs_a/avilella/src/primegsr/latest/PrimeGSR_1.0/primeGSR -o
>>> $node_id.mcmc -i 1000000 -t 100 -Sm JTT -Bp 0.1 0.1 -Bt 1.0  -Ed Gamma
>>> -Hi test_ultrametric_ape_1.nh $node_id.fasta $node_id.gsmap
>>> 
>>> On Thu, Jun 10, 2010 at 12:27 PM, Jacky Hess <jacky at ebi.ac.uk> wrote:
>>>> Hi,
>>>> 
>>>> I hadn't managed to find a way around it at the time but I am starting
>>>> to look at it again and would be interested to hear if anyone found a
>>>> solution to this as well.
>>>> 
>>>> Best,
>>>> Jacky
>>>> 
>>>> On 08/04/2010 12:18, James Cotton wrote:
>>>>> Hi,
>>>>> 
>>>>> We've finally got primeGSR running on our data, but we now seem to
>>>>> have an MCMC that does not move at all, so the posterior for every
>>>>> parameter consists of a single value, and only a single tree is sampled.
>>>>> 
>>>>> We get error messages that are the same as those reported by Jacky
>>>>> Hess on Tue Sep 29 11:38:56 CEST 2009 - lots of 'Tried to set length
>>>>> of node' messages.
>>>>> 
>>>>> I'm wondering if Jacky or anyone else on the forum has managed to fix
>>>>> this kind of problem, and if so, how!
>>>>> 
>>>>> Thanks
>>>>> James Cotton
>>>>>
>>>>> _____________________________________________
>>>>> James Cotton
>>>>> School of Biological and Chemical Sciences
>>>>> Queen Mary, University of London
>>>>> +44 (0)207 882 3645
>>>>> j.a.cotton at qmul.ac.uk
>>>>> http://webspace.qmul.ac.uk/jacotton/index.html
>>>>> http://www.sbcs.qmul.ac.uk/staff/jamescotton.html
>>>>> _____________________________________________
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Primeusers mailing list
>>>>> Primeusers at sbc.su.se
>>>>> https://mail.sbc.su.se/mailman/listinfo/primeusers
>>>>> 
>>>> 
