+ 6

# [Assignment] [Challenge]: Longuest common substring

Given X and Y two strings, build a function C=S(X,Y) so that C is a subsequence of X and Y and C is the longuest (you can ignore cases where there are many of them) # Tests (X,Y),C => f(X,Y) should equal C ("azertyuiop","zwwxuiop"),"zuiop" ("aaaataaaazbaazzz","yitafgabjjazebza"),"taaazbz" ("12345","aeetyyy"),"" ("a2z3e4r5t6zz","a234562ertzz"),"a23ertzz" Bigger test case here : https://code.sololearn.com/ceO4P41uQE7E/#py

14 ответов

+ 19

https://code.sololearn.com/c0rMCRCnPRCP/?ref=app

+ 5

@VcC I didn't want to spoil it to let the others find out their own way, but if you ask...
Here, it is a recursive one together with some unnecessary optimizations (as usual well commented):
https://code.sololearn.com/cpsKMKAAHnS3/?ref=app
However, I have just got an idea, how it can be done even easier by realization with about the same time complexity and less memory complexity. I will try to code it later...

+ 4

I think people try to solve then look at others code... Mine same as yours.
https://code.sololearn.com/csH8yT3oNfBJ/?ref=app

+ 3

@VcC One more super cool task!
It took me for the almost half of hour to solve it properly.
A brute force solution will have a huge complexity of about O(2^(M+N)) by time and O(M*2^M+N*2^N) by memory, where M and N - lengths of input strings. However, a smarter approach allows to reduce it (if I estimate it correctly) up to O(2^max(M,N)) by time and O(M+N) by memory.

+ 3

Cool. Show us your solution ! As i am concerned i focused on doing it in one line as short as possible :-)

+ 3

Bigger test case if you want to see how efficient your solution is:
https://code.sololearn.com/ceO4P41uQE7E/#py

+ 3

Since nobody else didn't want to demonstrate the solution based on dynamic programming with memoization approach, I have done it by my own:
https://code.sololearn.com/c7W8OV49STkF/?ref=app
Its time and memory complexities should be both equal to O(N*M), where M and N - lengths of input strings.
However, I'm not the expert in a field, so feel free to test it and inform me about any bugs that you have met!

+ 3

I have found some time to perform several time measurements.
PART 1: My recursive solution and its optimizations
PART 2: Comparision to @VcC recursive solution
PART 3: Notes about dynamic programming solutions
To test recursive solutions I took 2 examples:
1. ("aaaataaaazbaazzz","yitafgabjjazebza") -> "taaazbz"
2. ("a2z3e4r5t6zz","a234562ertzz") -> "a23ertzz"
All times are given in relative units.
##### PART 1 #####
I test 2 optimizations that I introduced in my code.
1. First one checks if two input strings are equal. If so, there is no need to continue recursion, just this string can be returned. It is useful for long strings, which have similar symbols at their ends.
2. Second optimization: first string (from which symbols are taken) is always shorter (or equal by length) than second string (in which symbols are searched).
Test 1:
w/o optimizations: 1x
1 optimization: 0.9x
2 optimization: 0.45x
both optimizations: 0.45x
Test 2:
w/o optimizations: 1x
1 optimization: 0.65x
2 optimization: 0.98x
both optimizations: 0.73x
We can see that for first example second optimization helps to gain more than twice! First optimization is also useful when used independently but doesn't gain more together with the second one.
And it is completely different for the other test case. First optimization is much more useful that the second one. And when used together second optimization only slows down the solution!
##### PART 2 #####
Comparision of 2 solutions from me (with both optimizations) and @VcC also shows interesting results!
Test 1:
@Nevfy: 1x
@VcC: 0.54x
Test 2:
@Nevfy: 1x
@VcC: 1.9x
We see that each solution works better for specific tests.
##### PART 3 #####
I have found that my and @VcC solutions for dynamic programming approach are almost identical. However, there is a major thing that allows VcC's solution to be almost two times quicker. He uses build-in Python lists, when I use NumPy arrays. So you should choose your data structures wisely!

+ 2

Agree ! Updated test cases

+ 2

Nevfy Nice analysis. The rule for python efficiency is : find the good algorithm, use the efficient datastructure and prebuilt functions and leave the smallest possible work to the slow python interpreter.
One reason i m trying to oneline any problem - make me find new ways to use python possibilities while using the interpreter as less as possible

+ 2

#what u hv given as eg: is not substring
..
its substring of all possible permutations
azertyuiop
zwwxuiop
for substring... ans shud be uiop
zuiop is not a substring or subsequence of either two...

+ 1

test cases unclear. in case 2, the subsequence "taaazba" can be formed from both, and in case 3 the same can applied to the subsequence "a23ertzz"