pyspark - Related information

  • Why does the same PySpark statement work in Jupyter Notebook but fail in PyCharm?

    In a Linux environment, I run: categoriMap = train.map(lambda x: x[3]).distinct().zipWithIndex().collectAsMap() This statement runs without error in Jupyter Notebook, but not in PyCharm. I would like to ask why the wrong cont...

    Jul.07,2021
  • How to modify the values of a column in a PySpark DataFrame

    The data looks like this:

    Survived  age
    0         22.0
    1         38.0
    ...

    Apr.30,2021
  • How can a join of two large datasets in Spark avoid a shuffle?

    Purpose: two large datasets in Spark need to be joined. Both inputs contain the field userid, and they need to be joined on userid. I hope to avoid a shuffle. Completed: I pre-processed the two datasets into 1w f...

  • Why does PySpark fail to call a third-party Python library inside an RDD operation?

    Problem description: Hi, I used jieba word segmentation while running PySpark on the company's production environment. The import succeeded, but when I called the segmentation function inside an RDD operation, it reported that there is no module named jieba, without th...

    Mar.28,2021