성능 ConcurrentHashmap 대 HashMap
ConcurrentHashMap의 성능은 HashMap, 특히 .get () 작업과 비교하여 어떻습니까 (특히 0-5000 범위의 항목이 거의없는 경우에만 관심이 있습니다)?
HashMap 대신 ConcurrentHashMap을 사용하지 않는 이유가 있습니까?
(널 값이 허용되지 않는다는 것을 알고 있습니다)
최신 정보
명확히하기 위해 실제 동시 액세스의 경우 성능이 저하되지만 동시 액세스가없는 경우의 성능을 어떻게 비교합니까?
이 주제가 너무 오래되었다는 사실에 정말 놀랐지 만 아직 아무도 사례에 대한 테스트를 제공하지 않았습니다. 사용 ScalaMeter
나는의 테스트를 만들었습니다 add
, get
그리고 remove
모두 HashMap
와 ConcurrentHashMap
두 가지 시나리오에 :
- 단일 스레드 사용
- 사용 가능한 코어만큼 많은 스레드를 사용합니다. 스레드로부터 안전하지 않기 때문에 각 스레드에 대해
HashMap
별도로 생성HashMap
했지만 하나의 shared를 사용했습니다ConcurrentHashMap
.
내 저장소에서 코드를 사용할 수 있습니다 .
결과는 다음과 같습니다.
- X 축 (크기)은지도에 기록 된 요소의 수를 나타냅니다.
- Y 축 (값)은 시간을 밀리 초로 표시합니다.
요약
가능한 한 빨리 데이터에 대해 작업하려면 사용 가능한 모든 스레드를 사용하십시오. 각 스레드는 전체 작업의 1 / n을 수행해야합니다.
단일 스레드 액세스 사용을 선택
HashMap
하면 더 빠릅니다. 대한add
방법조차 많은 배와 같은 더 효율적입니다. 에서만get
빠르지 만ConcurrentHashMap
많지는 않습니다.When operating on
ConcurrentHashMap
with many threads it is similarly effective to operating on separateHashMaps
for each thread. So there is no need to partition your data in different structures.
To sum up, the performance for ConcurrentHashMap
is worse when you use with single thread, but adding more threads to do the work will definitely speed-up the process.
Testing platform
AMD FX6100, 16GB Ram
Xubuntu 16.04, Oracle JDK 8 update 91, Scala 2.11.8
Thread safety is a complex question. If you want to make an object thread safe, do it consciously, and document that choice. People who use your class will thank you if it is thread safe when it simplifies their usage, but they will curse you if an object that once was thread safe becomes not so in a future version. Thread safety, while really nice, is not just for Christmas!
So now to your question:
ConcurrentHashMap (at least in Sun's current implementation) works by dividing the underlying map into a number of separate buckets. Getting an element does not require any locking per se, but it does use atomic/volatile operations, which implies a memory barrier (potentially very costly, and interfering with other possible optimisations).
Even if all the overhead of atomic operations can be eliminated by the JIT compiler in a single-threaded case, there is still the overhead of deciding which of the buckets to look in - admittedly this is a relatively quick calculation, but nevertheless, it is impossible to eliminate.
As for deciding which implementation to use, the choice is probably simple.
If this is a static field, you almost certainly want to use ConcurrentHashMap, unless testing shows this is a real performance killer. Your class has different thread safety expectations from the instances of that class.
If this is a local variable, then chances are a HashMap is sufficient - unless you know that references to the object can leak out to another thread. By coding to the Map interface, you allow yourself to change it easily later if you discover a problem.
If this is an instance field, and the class hasn't been designed to be thread safe, then document it as not thread safe, and use a HashMap.
If you know that this instance field is the only reason the class isn't thread safe, and are willing to live with the restrictions that promising thread safety implies, then use ConcurrentHashMap, unless testing shows significant performance implications. In that case, you might consider allowing a user of the class to choose a thread safe version of the object somehow, perhaps by using a different factory method.
In either case, document the class as being thread safe (or conditionally thread safe) so people who use your class know they can use objects across multiple threads, and people who edit your class know that they must maintain thread safety in future.
I would recommend you measure it, since (for one reason) there may be some dependence on the hashing distribution of the particular objects you're storing.
The standard hashmap provides no concurrency protection whereas the concurrent hashmap does. Before it was available, you could wrap the hashmap to get thread safe access but this was coarse grain locking and meant all concurrent access got serialised which could really impact performance.
The concurrent hashmap uses lock stripping and only locks items that affected by a particular lock. If you're running on a modern vm such as hotspot, the vm will try and use lock biasing, coarsaning and ellision if possible so you'll only pay the penalty for the locks when you actually need it.
In summary, if your map is going to be accesaed by concurrent threads and you need to guarantee a consistent view of it's state, use the concurrent hashmap.
In the case of a 1000 element hash table using 10 locks for whole table saves close to half the time when 10000 threads are inserting and 10000 threads are deleting from it.
The interesting run time difference is here
Always use Concurrent data structure. except when the downside of striping (mentioned below) becomes a frequent operation. In that case you will have to acquire all the locks? I read that the best ways to do this is by recursion.
Lock striping is useful when there is a way of breaking a high contention lock into multiple locks without compromising data integrity. If this is possible or not should take some thought and is not always the case. The data structure is also the contributing factor to the decision. So if we use a large array for implementing a hash table, using a single lock for the entire hash table for synchronizing it will lead to threads sequentially accessing the data structure. If this is the same location on the hash table then it is necessary but, what if they are accessing the two extremes of the table.
The down side of lock striping is it is difficult to get the state of the data structure that is affected by striping. In the example the size of the table, or trying to list/enumerate the whole table may be cumbersome since we need to acquire all of the striped locks.
What answer are you expecting here?
It is obviously going to depend on the number of reads happening at the same time as writes and how long a normal map must be "locked" on a write operation in your app (and whether you would make use of the putIfAbsent
method on ConcurrentMap
). Any benchmark is going to be largely meaningless.
당신의 의미는 명확하지 않습니다. 스레드 안전성이 필요한 경우 선택의 여지가 거의 없으며 ConcurrentHashMap 만 있습니다. 그리고 그것은 확실히 get () 호출에서 성능 / 메모리 페널티가 있습니다-불운 한 경우 휘발성 변수에 대한 액세스 및 잠금.
참고 URL : https://stackoverflow.com/questions/1378310/performance-concurrenthashmap-vs-hashmap
'programing tip' 카테고리의 다른 글
Ubuntu에 Android SDK를 설치하는 방법은 무엇입니까? (0) | 2020.12.03 |
---|---|
PowerShell에서 디렉터리 변경 (0) | 2020.12.03 |
json 출력에 가상 속성 추가 (0) | 2020.12.03 |
문자열에 알파벳 문자가 포함되어 있는지 Javascript에서 확인하는 방법 (0) | 2020.12.03 |
부트 스트랩 모달을 더 넓게 만들려고 (0) | 2020.12.03 |