[help] Filtering duplicate elements in a Java 8 List<Map<String, Object>>

Problem description

The List<Map> data structure is as follows:

    List<Map<String, Object>> list = new ArrayList<>();
    
    Map<String, Object> map1 = new HashMap<>();
    map1.put("order_no", "123");
    map1.put("quantity", 10);
    map1.put("amount", 100);
    
    Map<String, Object> map2 = new HashMap<>();
    map2.put("order_no", "223");
    map2.put("quantity", 15);
    map2.put("amount", 150);
    
    Map<String, Object> map3 = new HashMap<>();
    map3.put("order_no", "123");
    map3.put("quantity", 5);
    map3.put("amount", 50);
    
    Map<String, Object> map4 = new HashMap<>();
    map4.put("order_no", "124");
    map4.put("quantity", 6);
    map4.put("amount", 60);
    
    Map<String, Object> map5 = new HashMap<>();
    map5.put("order_no", "223");
    map5.put("quantity", 7);
    map5.put("amount", 70);
    
    list.add(map1);
    list.add(map2);
    list.add(map3);
    list.add(map4);
    list.add(map5);

The requirement is to check whether the list above contains Maps that share the same value for the key order_no, and to extract the duplicated items. In this example, we should end up with the orders whose order_no is 123 and 223. My current code is:

    // list2 is a copy of list
    List<Map<String, Object>> list2 = new ArrayList<>(list);

    List<Map<String, Object>> collect = list.stream()
        // keep only the rows whose order_no occurs more than once (rescans list2 for every row)
        .filter(x -> list2.stream()
            .filter(x2 -> x2.get("order_no").equals(x.get("order_no")))
            .count() > 1)
        // group the remaining rows by order_no
        .collect(Collectors.groupingBy(x -> x.get("order_no")))
        .entrySet().stream()
        .map(x -> {
            Map<String, Object> tmp = new HashMap<>();
            tmp.put("key_order", x.getKey());
            tmp.put("order_list", x.getValue());
            return tmp;
        })
        .collect(Collectors.toList());

Although this works, there may be tens of thousands of orders or more, and defining a second copy of the list and rescanning it for every element is crude and inefficient. Is there a more concise, efficient, and elegant way to do this?

Jan.08,2022

The Java 8 Stream API provides a distinct() method that de-duplicates whole elements using equals(), but it has no method for de-duplicating by a key extracted from each element, so we have to write an extension ourselves.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;
    import java.util.function.Predicate;

    /**
     * @author . created in 2018/12/01 00:01
     */
    public class StreamEx {

        // Returns a stateful predicate that is true only the first time a key is seen.
        public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
            Map<Object, Boolean> seen = new ConcurrentHashMap<>();
            return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
        }
    }

Test

    list.stream()
        .filter(StreamEx.distinctByKey(x -> x.get("order_no")))
        .forEach(System.out::println);

    // {order_no=123, amount=100, quantity=10}
    // {order_no=223, amount=150, quantity=15}
    // {order_no=124, amount=60, quantity=6}

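As you can see, the code above prints the de-duplicated data.

To get the duplicates instead, you only need to change this line in StreamEx:

    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;

to

    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) != null;

The filter then keeps the second and later occurrences of each order_no (here the repeated entries for 123 and 223), which identifies the duplicated order numbers and can meet your requirements.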
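
Note that a predicate flipped to != null keeps only the second and later occurrences, while the code in the question collects every entry that shares a duplicated order_no. If that full grouping is what is needed, one possible approach is to group once and then drop the groups with a single entry. The following is only a sketch under that assumption; the class name DuplicateOrders and the helpers findDuplicates and order are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
class DuplicateOrders {

    // Groups the rows by order_no in a single pass, then removes every
    // group that contains fewer than two rows.
    static Map<Object, List<Map<String, Object>>> findDuplicates(List<Map<String, Object>> list) {
        Map<Object, List<Map<String, Object>>> groups = list.stream()
                .collect(Collectors.groupingBy(
                        m -> m.get("order_no"),
                        LinkedHashMap::new,      // keeps first-encounter order of the keys
                        Collectors.toList()));
        groups.values().removeIf(rows -> rows.size() < 2);
        return groups;
    }

    // Builds one row shaped like the maps in the question.
    static Map<String, Object> order(String orderNo, int quantity, int amount) {
        Map<String, Object> m = new HashMap<>();
        m.put("order_no", orderNo);
        m.put("quantity", quantity);
        m.put("amount", amount);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> list = new ArrayList<>();
        list.add(order("123", 10, 100));
        list.add(order("223", 15, 150));
        list.add(order("123", 5, 50));
        list.add(order("124", 6, 60));
        list.add(order("223", 7, 70));

        findDuplicates(list).forEach((orderNo, rows) ->
                System.out.println(orderNo + " -> " + rows.size() + " rows"));
        // 123 -> 2 rows
        // 223 -> 2 rows
    }
}
```

Each row is visited once while grouping, so the work grows roughly linearly with the number of orders instead of rescanning the list for every element.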


Why not just use a Set to check for duplicates? A single loop is enough. The stream version is far too complicated.

    // JSON below is assumed to come from a JSON library such as fastjson.
    Set<String> set = new HashSet<>();                          // order numbers seen so far
    Map<String, List<Map<String, Object>>> valMap = new HashMap<>();
    for (Map<String, Object> item : list) {
        String id = item.get("order_no").toString();
        if (set.contains(id)) {
            // seen before: record this repeated occurrence
            List<Map<String, Object>> l = valMap.computeIfAbsent(id, k -> new ArrayList<>());
            l.add(item);
        }
        set.add(id);
    }

    for (Map.Entry<String, List<Map<String, Object>>> entry : valMap.entrySet()) {
        System.out.println(JSON.toJSONString(entry.getValue()));
    }
    // prints (only the repeated occurrences, not the first one):
    // [{"order_no":"123","amount":50,"quantity":5}]
    // [{"order_no":"223","amount":70,"quantity":7}]

This is simple. Whenever an element is added to the list, also index it in a Map whose key is the element's order_no and whose value is the element (the map) itself. Because only a reference is stored, the extra space cost is small, and you also get the Map's O(1) lookup.
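
As a rough illustration of indexing at insertion time, the sketch below keeps the list and a Map keyed by order_no in sync. It is not from the original answer: the class OrderIndex and its methods add, rowsFor and isDuplicated are hypothetical names, and it assumes the index stores a list of rows per order_no (rather than a single row) so that duplicates can be told apart.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper, for illustration only: a list plus an order_no index.
class OrderIndex {
    private final List<Map<String, Object>> rows = new ArrayList<>();
    private final Map<Object, List<Map<String, Object>>> byOrderNo = new HashMap<>();

    // Adds the row to the list and to the index; only references are stored.
    void add(Map<String, Object> row) {
        rows.add(row);
        byOrderNo.computeIfAbsent(row.get("order_no"), k -> new ArrayList<>()).add(row);
    }

    // Expected O(1) lookup of all rows with the given order_no.
    List<Map<String, Object>> rowsFor(Object orderNo) {
        return byOrderNo.getOrDefault(orderNo, Collections.emptyList());
    }

    // True when more than one row shares the given order_no.
    boolean isDuplicated(Object orderNo) {
        return rowsFor(orderNo).size() > 1;
    }

    public static void main(String[] args) {
        OrderIndex index = new OrderIndex();
        for (String orderNo : new String[] {"123", "223", "123", "124"}) {
            Map<String, Object> row = new HashMap<>();
            row.put("order_no", orderNo);
            index.add(row);
        }
        System.out.println(index.isDuplicated("123")); // true
        System.out.println(index.isDuplicated("124")); // false
    }
}
```

The trade-off is that every insertion must go through add so the index never drifts out of step with the list.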
