G'day everyone
I just need some advice on how to process two sets of data.
Question 1
I have two lists (list 1 and list 2) that I want to convert to three other lists (lists A, B and C). The data, by the way, is simply file names (one file name per line). List 1 is an old version of the data and list 2 is a new version of the data.
I want to create three lists, namely: a list that contains only data that is unique to list 1 (call it List A), and a list that contains only data that is unique to list 2 (call it list B), and a list that contains only the data that occurs in both list 1 and list 2 (call it list C).
I must confess that I'm pretty much a kindergarten programmer, so the way I would do it is one step at a time: i.e. create list A by comparing each line from list 1 with each line from list 2, one line at a time. I can then do the same thing all over again for list B, and all over again for list C.
[ autoit ]
; $list1 and $list2 are the two original lists, i.e. lines of data in a string
; $listA is the newly created list with data that is unique to $list1
$list1array = StringSplit ($list1, @CRLF, 1)
For $i = 1 to $list1array[0]
    If NOT StringInStr ($list2, $list1array[$i]) Then
        $listA = $listA & @CRLF & $list1array[$i]
    EndIf
Next
However, I expect these lists to be very large (millions of lines), and I want to know if there is a quicker way of doing it. Specifically, I want to know if there is a way to create all three lists A, B and C in a single step.
So, the first question is: Will it be significantly faster to use a process that eats list 1 and 2 and spits out list A, B and C in one step? And if so, can you tell me what that method may be (or point me in some direction), please?
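One possible direction, offered as a rough sketch rather than a tested solution: instead of searching the whole of list 2 for every line of list 1, each line could be recorded once in a lookup table and flagged as "list 1 only", "list 2 only" or "both". Lists A, B and C then fall out of the same pass. The sketch below assumes the Windows `Scripting.Dictionary` COM object is available, that file names should be compared without regard to case, that blank lines can be ignored, and that a name repeated within one list only needs to appear once in the output. `_SplitLists` is a hypothetical helper name, not an AutoIt built-in or UDF, and it has not been timed on millions of lines.

```autoit
; Example call with two tiny lists (the file names are made up for illustration).
Local $listA, $listB, $listC
_SplitLists("a.txt" & @CRLF & "b.txt", "b.txt" & @CRLF & "c.txt", $listA, $listB, $listC)
ConsoleWrite("A:" & @CRLF & $listA & "B:" & @CRLF & $listB & "C:" & @CRLF & $listC)

; Sketch only: _SplitLists is a hypothetical helper, not part of AutoIt or its UDFs.
; $list1 and $list2 are strings of file names separated by @CRLF.
; On return, $listA, $listB and $listC hold one name per line, each followed by @CRLF.
Func _SplitLists($list1, $list2, ByRef $listA, ByRef $listB, ByRef $listC)
    Local Const $IN_1 = 1, $IN_2 = 2, $IN_BOTH = 3
    Local $seen = ObjCreate("Scripting.Dictionary")
    $seen.CompareMode = 1 ; assumption: ignore case, as Windows file names do

    ; Pass 1: record every non-empty line of list 1.
    Local $lines = StringSplit($list1, @CRLF, 1)
    For $i = 1 To $lines[0]
        If StringLen($lines[$i]) > 0 And Not $seen.Exists($lines[$i]) Then
            $seen.Add($lines[$i], $IN_1)
        EndIf
    Next

    ; Pass 2: a line of list 2 is either new, or already known from list 1 (so it is in both).
    $lines = StringSplit($list2, @CRLF, 1)
    For $i = 1 To $lines[0]
        If StringLen($lines[$i]) = 0 Then ContinueLoop
        If Not $seen.Exists($lines[$i]) Then
            $seen.Add($lines[$i], $IN_2)
        ElseIf $seen.Item($lines[$i]) = $IN_1 Then
            $seen.Item($lines[$i]) = $IN_BOTH
        EndIf
    Next

    ; Write every distinct line to A, B or C according to its flag.
    $listA = ""
    $listB = ""
    $listC = ""
    Local $keys = $seen.Keys()
    For $key In $keys
        Switch $seen.Item($key)
            Case $IN_1
                $listA &= $key & @CRLF
            Case $IN_2
                $listB &= $key & @CRLF
            Case $IN_BOTH
                $listC &= $key & @CRLF
        EndSwitch
    Next
EndFunc
```

The design choice here is that each line costs one dictionary lookup, whereas `StringInStr` rescans the other list for every line and can also match a name that is merely part of a longer name. Whether that is fast enough for millions of lines would still need to be measured.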
Question 2
If it turns out that creating the three lists one at a time is going to be no slower than some magical method of creating all three lists at the same time, my next question is how to create list C (the one that contains only data that occurs in both list 1 and list 2).
One method I can think of is to simply merge list 1 and list 2 into a temporary list (say, list D), then create another temporary list (list E) in which the duplicates from list D have been removed (using _ArrayUnique), and then use StringReplace with each line of list E to count whether it occurs once or twice in list D and, if twice, write it to list C... one line at a time.
[ autoit ]
#include <Array.au3> ; needed for _ArrayUnique
; $list1 and $list2 are the two original lists, i.e. lines of data in a string
; $listC will be the final list with only data that occurs in both $list1 and $list2
; $listD is the temporary list with all entries in it
; $listE is the temporary list with only unique entries from $listD
$listD = $list1 & @CRLF & $list2
; I expect this would be necessary:
; While StringInStr ($listD, @CRLF & @CRLF)
;     $listD = StringReplace ($listD, @CRLF & @CRLF, @CRLF)
; WEnd
$listDarray = StringSplit ($listD, @CRLF, 1)
$listEarray = _ArrayUnique ($listDarray)
; $listE = _ArrayToString ($listEarray, @CRLF)
For $i = 1 to $listEarray[0]
    ; Replacing a line with itself leaves $listD unchanged; @extended holds the number of matches
    $j = StringReplace ($listD, $listEarray[$i], $listEarray[$i])
    If @extended = 2 Then
        $listC = $listC & @CRLF & $listEarray[$i]
    EndIf
Next
I'm sure this must look really silly -- there must be a simpler, more magical way.
So, the second question is: Do you know of a simple way to compare two arrays and create a third array that contains only data that occurs in both those arrays?
Thanks
Samuel