Basically exactly what the title says. In case there isn’t a great place, or this post ends up getting more visibility than wherever I end up asking, I will explain my approximate competency level and the question below.

In terms of competency I have an engineering background and degree, which means I had a single class in statistics. Technically I was one class short of a math minor (Graph Theory) when I graduated. Unlike most engineers and Six Sigma “graduates” I don’t think this automatically makes me some kind of math/stats wizard. I’m aware I know just enough that I can unintentionally massage data to fit my bias (mini rant over).

My question is: when looking at a human population and trying to find the approximate subset of people with certain attributes, how are correlations handled to avoid double counting?

For example, let’s say I am looking at a specific city and my data sets are the most recent census, BLS.gov, and Pew Research. With the above sources I can pretty easily estimate something along the lines of

The number of men in a US city who:

  • Are between the ages of 22 and 44
  • Have a STEM degree

However, if I then wanted to add another factor:

  • Are/Vote liberal

I know that is going to interact with the original criteria, because higher levels of education are correlated with people being more liberal. So if I just punched in the percentages from all three data points and multiplied them together, the resulting number is likely going to be much smaller than reality.
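To make the worry concrete, here is a rough sketch in Python. Every number in it is invented purely for illustration (none of them come from the census, BLS, or Pew), and the variable names are hypothetical. It compares multiplying the citywide percentages as if the traits were unrelated against using the liberal share within the STEM group.

```python
# All figures below are made up for illustration only.
population = 100_000          # hypothetical count of men aged 22-44 in the city
p_stem = 0.20                 # assumed share of that group with a STEM degree
p_liberal = 0.50              # assumed share of that group who are/vote liberal
p_liberal_given_stem = 0.65   # assumed liberal share among STEM-degree holders

# Naive estimate: multiply the overall shares as if the traits were independent.
naive = population * p_stem * p_liberal

# Correlation-aware estimate: use the liberal share *within* the STEM group
# instead of the citywide liberal share.
adjusted = population * p_stem * p_liberal_given_stem

print(f"naive estimate:    {naive:,.0f}")
print(f"adjusted estimate: {adjusted:,.0f}")
print(f"naive falls short by {1 - naive / adjusted:.0%}")
```

The catch is that the within-group share (`p_liberal_given_stem` here) is exactly the number the separate published tables usually don't give me, which is why I'm asking.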

Is there a term or method I can read up on for how to account for overlaps/correlations between population subsets? Does this make sense or am I asking the wrong kind of question?

FWIW none of this is related to my job, an argument, a shit post, a data graphic, or anything else I will ever really make. It’s just for something specific (not actually the above example, but something like it using the sources I mentioned) that I am personally curious about. I have also more generally been wondering about how to account for this kind of overlap for a couple of years now.

Regardless, thanks for taking the time to at least read all this.

Cheers!

  • Admiral Patrick@dubvee.org
    1 month ago

    The only one I can really see is !statistics@lemmy.world but it doesn’t appear to be active at all. The only moderator for it hasn’t posted anything in a year or so.

    Looks like some “if you build it, they will come” is needed.

    Not sure of the procedure, but you may reach out to the LW admins to see about taking over the community if the mod is confirmed AWOL.

  • lolola
    1 month ago

    I don’t know of such a community, but I’ll point you in the direction of conditional probability.
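    As a rough sketch of how that applies here, writing A, B, and C (labels chosen just for this illustration) for "aged 22-44", "has a STEM degree", and "is liberal", the chain rule of conditional probability gives the joint share, and it only collapses to a plain product of the three percentages if the attributes are independent:

```latex
% Chain rule: always true
P(A \cap B \cap C) = P(A)\, P(B \mid A)\, P(C \mid A \cap B)

% Plain product of marginals: only valid under independence
P(A \cap B \cap C) = P(A)\, P(B)\, P(C)
```

    In the post's example the concern is that P(C | A ∩ B) is larger than P(C), so the plain product would come out too low.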

    Also:

    Six Sigma “graduates”

    Lol fair description