1. When converting the real data of two DataFrame tables to NumPy arrays in order to compare column contents, I ran into the following situation:
input:
individual elements of the two arrays
d12.values[0][0],
d12.values[1][0],
d11.values[0][1],
d11.values[1][1],
comparing one element of one array against a whole column of the other array
d12.values[0][0] == d11.values[:, 1]
Output:
(ubegu6db2UniqpU7cbeUnixU7c89spiral,
ufu6db2Uniqu7cbeUnixu7c89spiral,
upendu6db2Unixu7cbeUnixu7c89rabbit,
upendu6db2qu7cbeUnigram u7c89cards,
False)
False)
input:
d12.values[0][0] == d11.values[0][1],
d12.values[0][0] == d11.values[1][1]
Output:
(True, True)
Some elements in the columns of the two arrays are equal when compared one by one, but comparing a single element against the whole column of the other array returns all False.
However, the same comparison works as expected on test data that I built by hand:
input:
d1 = pd.DataFrame([["Xerogramme," c01,"A", "1010"], ["x","X", "1020"], ["x","x", "1020"], ["x", "1010"], ["x", "c02", "1010"], ["y"], ["1020"], ["YYZ"], ["1020"], ["yellows"], ["1020"], ["1020"], ["1020"], ["1020"], ["1020"], ["1020"] ], columns = list("abcd"))
d1
output:
input:
d2 = pd.DataFrame([["x","20001","530cm"],["x","20002","150cm"],["x","20003","340cm"],["y","20004","10"],["y","20005","30"],["z","20006","100"],["z","20007","200"],["z","20008","300"]],columns = list("aef"))
d2
output:
input:
d2.values[0][0] == d1.values[:, 0]
output:
array([True, False, False, False], dtype=bool)
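For reference, here is a minimal sketch of the behaviour I expect, where the one-to-many comparison gives the same answer as comparing the elements one at a time. The two small frames are hypothetical stand-ins and not the original test data, and `dtype=object` is an assumption about how the string columns are stored.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frames (NOT the original data); dtype=object is an
# assumption so that .values yields an object array of Python strings.
d1 = pd.DataFrame(
    [["x", "c01"], ["y", "c02"], ["z", "c01"], ["w", "c02"]],
    columns=list("ab"),
    dtype=object,
)
d2 = pd.DataFrame(
    [["x", "20001"], ["y", "20004"]],
    columns=list("ae"),
    dtype=object,
)

scalar = d2.values[0][0]   # a single element
column = d1.values[:, 0]   # a whole column of the other array

# One-to-many: NumPy broadcasts the scalar against every element of the column.
one_to_many = scalar == column

# One-to-one: the same comparison done element by element in plain Python.
one_to_one = np.array([scalar == item for item in column])

print(one_to_many)
print(one_to_one)
```

With clean data like this the two results agree, which is why the all-False result on the real data is surprising.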
Question: why do the one-to-one comparisons return True, while the one-to-many comparison (against the entire array) returns all False?
I should add that the following warning also appeared during the earlier comparison of the real data (it did not appear with the test data):
D:\Anaconda2\lib\site-packages\ipykernel\__main__.py:8: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
after setting:
import sys
reload(sys)
sys.setdefaultencoding("utf8")
the warning is no longer reported, but the problem described above still occurs.
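The UnicodeWarning suggests that one side of the comparison may hold byte strings while the other holds unicode text. That is an assumption; nothing in the output above confirms it. The sketch below shows one way to test the idea: build a column that mixes both types, compare it as-is, then compare again after decoding everything to text. The helper `to_text`, the sample values, and the UTF-8 encoding are all hypothetical.

```python
import numpy as np

def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; return any other value unchanged.

    Hypothetical helper; the encoding of the real data is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Hypothetical column mixing UTF-8 bytes and unicode text, as could happen
# when two tables are loaded from different sources.
column = np.array(
    [u"\u6db2".encode("utf-8"), u"\u6db2", u"\u7c89"],
    dtype=object,
)
target = u"\u6db2"

# Raw comparison: the byte-string element does not compare equal to the text.
raw = column == target

# Normalized comparison: decode every element to text first, then compare.
normalized = np.array([to_text(v) for v in column], dtype=object) == target

# Listing the element types shows quickly whether a column is mixed.
kinds = sorted(set(type(v).__name__ for v in column))

print(raw)
print(normalized)
print(kinds)
```

If `kinds` shows more than one string type for the real columns, decoding both frames to text before comparing would be a more targeted fix than changing the interpreter-wide default encoding.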