Talk:A quick string randomizer

From BioPerl

A few quick tests seem to show that the last character in the submitted sequence never gets randomized. Also, I am not convinced that the sort stage is necessary, and there seems to be some unnecessary copying between vectors and maps.

I think the following:

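sub perm2 {
    # split the input string into single characters
    my @a = split('', shift);
    my @i = (0..$#a);    # indices not yet drawn
    my @j = @i;
    my @r;               # indices in shuffled order
    # on each pass, remove one randomly chosen index from @i
    map {
        push @r, splice(@i, rand($_+1), 1);
    } reverse @j;
    return join('', @a[@r]);
}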

achieves the same end, and runs 6000 times faster for a sequence of length 1000.
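
One way to check for the kind of positional bias mentioned above is to shuffle a string of distinct characters many times and count how often the last input character is still last. This is only a rough sketch: the helper name last_char_fixed_rate, the test string, and the trial count are made up for illustration, and perm2 is copied in so the snippet runs on its own.

```perl
use strict;
use warnings;

# Copy of perm2 from this page, with empty-string arguments to split/join.
sub perm2 {
    my @a = split('', shift);
    my @i = (0..$#a);
    my @j = @i;
    my @r;
    map {
        push @r, splice(@i, rand($_+1), 1);
    } reverse @j;
    return join('', @a[@r]);
}

# Hypothetical helper: fraction of trials in which the last character of
# $seq is still the last character after shuffling with $shuffle.
sub last_char_fixed_rate {
    my ($shuffle, $seq, $trials) = @_;
    my $last  = substr($seq, -1);
    my $fixed = 0;
    for (1..$trials) {
        $fixed++ if substr($shuffle->($seq), -1) eq $last;
    }
    return $fixed / $trials;
}

# With 10 distinct characters an unbiased shuffle should give a rate near 0.1;
# a shuffle that never moves the last character would give exactly 1.
printf "%.3f\n", last_char_fixed_rate(\&perm2, 'ABCDEFGHIJ', 10_000);
```

Any other shuffling sub can be passed in place of \&perm2 to compare its rate against the expected 1/length.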

  • Thanks, Nigel--You get pride of place, see the page. cheers --Majensen 04:13, 15 February 2009 (UTC)

I woke up this morning with a new algorithm, which is more efficient still.


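sub perm3 {
    my @a = split('', shift);
    my @i = (0..$#a);    # indices not yet drawn occupy slots 0..$j
    my @r;               # indices in shuffled order
    for (my $j = $#a; $j >= 0; $j--) {
        my $entry = int(rand($j+1));    # random slot among the remaining ones
        push @r, $i[$entry];
        $i[$entry] = $i[$j];            # refill the slot from the end, no splice
    }
    return join('', @a[@r]);
}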
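
To compare the two versions on your own machine, the core Benchmark module can be used. This is a rough sketch, not the timing setup used for the figures quoted on this page: it assumes copies of perm2 and perm3 as given here (with `my $j` and an explicit int() in perm3), and the random 1000-character test sequence and the one CPU second per sub are arbitrary choices.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Splice-based version: removes one random remaining index per pass.
sub perm2 {
    my @a = split('', shift);
    my @i = (0..$#a);
    my @j = @i;
    my @r;
    map {
        push @r, splice(@i, rand($_+1), 1);
    } reverse @j;
    return join('', @a[@r]);
}

# Swap-based version: overwrites the drawn slot with the last live slot.
sub perm3 {
    my @a = split('', shift);
    my @i = (0..$#a);
    my @r;
    for (my $j = $#a; $j >= 0; $j--) {
        my $entry = int(rand($j+1));
        push @r, $i[$entry];
        $i[$entry] = $i[$j];
    }
    return join('', @a[@r]);
}

# Arbitrary test input: 1000 random characters from A, C, G, T.
my $seq = join('', map { substr('ACGT', int(rand(4)), 1) } 1..1000);

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    perm2 => sub { perm2($seq) },
    perm3 => sub { perm3($seq) },
});
```

The table printed by cmpthese gives the rate of each sub and the percentage difference between them; the numbers will vary with Perl version, hardware, and sequence length.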


I used this algorithm not because I wanted to randomize strings, but because I wanted a short 'random' algorithm to reimplement in C++ to compare the relative efficiency of Perl and C++. The results are intriguing.

[1]

C++ is between 1.3 and 150 times more efficient, depending on the algorithm and string size. In each case I implemented the same algorithm, as closely as possible, in C++ and Perl.

--Nigel 08:55, 15 February 2009 (UTC)

  • Interesting- I'm not too surprised that as the Perl becomes more 'C-ified' the efficiency goes up. I like perm2 aesthetically as it is more idiomatically Perl. Do feel free to attach your versions to the Scrap pages directly. As long as the originals are preserved, to get a sense of the evolution, I think the more versions the merrier. [I chucked my algorithm due to the bug you observed...] cheers --Majensen 16:06, 15 February 2009 (UTC)