if two or more threads accessed tls in a dso that was loaded after
the threads were created, then __tls_get_new could do out-of-bounds
memory access (leading to segfault).
a byte count was accidentally used instead of an element count when
the new dtv pointer was computed. (dso->new_dtv is (void**), so
pointer arithmetic on it already scales by sizeof(void *).)
it is rare that the same dso provides dtv for several threads, so
the crash was not observed in practice, but it is possible to trigger.
/* Get new DTV space from new DSO if needed */
if (v[0] > (size_t)self->dtv[0]) {
void **newdtv = p->new_dtv +
- (v[0]+1)*sizeof(void *)*a_fetch_add(&p->new_dtv_idx,1);
+ (v[0]+1)*a_fetch_add(&p->new_dtv_idx,1);
memcpy(newdtv, self->dtv,
((size_t)self->dtv[0]+1) * sizeof(void *));
newdtv[0] = (void *)v[0];
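
the element-count versus byte-count mistake can be reproduced in isolation. the following is a minimal sketch, not musl code: slot_offset_elems, slot_offset_buggy and slot_ptr are hypothetical helpers, and n stands in for v[0] while idx stands in for the value returned by a_fetch_add. it assumes a pool of (void *) slots where each thread's slot holds n+1 pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of slot idx, counted in elements.
 * This is the right unit for arithmetic on a (void **) pointer,
 * because the compiler multiplies by sizeof(void *) itself. */
static size_t slot_offset_elems(size_t n, size_t idx)
{
	return (n + 1) * idx;
}

/* Hypothetical helper: the pre-fix form of the computation.
 * It yields a byte count, so adding it to a (void **) pointer
 * scales by sizeof(void *) a second time. */
static size_t slot_offset_buggy(size_t n, size_t idx)
{
	return (n + 1) * sizeof(void *) * idx;
}

/* Hypothetical helper: start of slot idx inside pool, which is
 * assumed to hold nslots slots of n+1 pointers each. */
static void **slot_ptr(void **pool, size_t nslots, size_t n, size_t idx)
{
	assert(idx < nslots);
	return pool + slot_offset_elems(n, idx);
}
```

for idx 0 both offsets are 0, which would be consistent with the problem needing at least two threads to show up. for any idx > 0 the buggy offset is sizeof(void *) times too large, so with n = 3 and idx = 2 the correct slot starts at element 8, while the buggy form points at element 8*sizeof(void *), past the end of a pool sized for a few threads.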