Histogram bin counts - MATLAB histcounts (2024)

Histogram bin counts

Syntax

[N,edges] = histcounts(X)

[N,edges] = histcounts(X,nbins)

[N,edges] = histcounts(X,edges)

[N,edges,bin] = histcounts(___)

N = histcounts(C)

N = histcounts(C,Categories)

[N,Categories] = histcounts(___)

[___] = histcounts(___,Name,Value)

Description

[N,edges] = histcounts(X) partitions the X values into bins and returns the bin counts and the bin edges. The histcounts function uses an automatic binning algorithm that returns uniform bins chosen to cover the range of elements in X and reveal the underlying shape of the distribution.

[N,edges] = histcounts(X,nbins) uses a number of bins specified by the scalar, nbins.

[N,edges] = histcounts(X,edges) sorts X into bins with the bin edges specified by the vector, edges.

[N,edges,bin] = histcounts(___) also returns an index array, bin, using any of the previous syntaxes. bin is an array of the same size as X whose elements are the bin indices for the corresponding elements in X. The number of elements in the kth bin is nnz(bin==k), which is the same as N(k).

N = histcounts(C), where C is a categorical array, returns a vector, N, that indicates the number of elements in C whose value is equal to each of C’s categories. N has one element for each category in C.

N = histcounts(C,Categories) counts only the elements in C whose value is equal to the subset of categories specified by Categories.

[N,Categories] = histcounts(___) also returns the categories that correspond to each count in N using either of the previous syntaxes for categorical arrays.

[___] = histcounts(___,Name,Value) specifies additional parameters using one or more name-value arguments. For example, you can specify BinWidth as a scalar to adjust the width of the bins for numeric data.

Examples

Bin Counts and Bin Edges

Distribute 100 random values into bins. histcounts automatically chooses an appropriate bin width to reveal the underlying distribution of the data.

X = randn(100,1);
[N,edges] = histcounts(X)

N = 1×7
     2    17    28    32    16     3     2

edges = 1×8
    -3    -2    -1     0     1     2     3     4

Specify Number of Bins

Distribute 10 numbers into 6 equally spaced bins.

X = [2 3 5 7 11 13 17 19 23 29];
[N,edges] = histcounts(X,6)

N = 1×6
     2     2     2     2     1     1

edges = 1×7
         0    4.9000    9.8000   14.7000   19.6000   24.5000   29.4000

Specify Bin Edges

Distribute 1,000 random numbers into bins. Define the bin edges with a vector, where the first element is the left edge of the first bin, and the last element is the right edge of the last bin.

X = randn(1000,1);
edges = [-5 -4 -2 -1 -0.5 0 0.5 1 2 4 5];
N = histcounts(X,edges)

N = 1×10
     0    24   149   142   195   200   154   111    25     0

Normalized Bin Counts

Distribute all of the prime numbers less than 100 into bins. Specify 'Normalization' as 'probability' to normalize the bin counts so that sum(N) is 1. That is, each bin count represents the probability that an observation falls within that bin.

X = primes(100);
[N,edges] = histcounts(X,'Normalization','probability')

N = 1×4
    0.4000    0.2800    0.2800    0.0400

edges = 1×5
     0    30    60    90   120

Determine Bin Placement

Distribute 100 random integers between -5 and 5 into bins, and specify 'BinMethod' as 'integers' to use unit-width bins centered on integers. Specify a third output for histcounts to return a vector representing the bin indices of the data.

X = randi([-5,5],100,1);
[N,edges,bin] = histcounts(X,'BinMethod','integers');

Find the bin count for the third bin by counting the occurrences of the number 3 in the bin index vector, bin. The result is the same as N(3).

count = nnz(bin==3)
count = 8

Categorical Bin Counts

Create a categorical vector that represents votes. The categories in the vector are 'yes', 'no', or 'undecided'.

A = [0 0 1 1 1 0 0 0 0 NaN NaN 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 1];
C = categorical(A,[1 0 NaN],{'yes','no','undecided'})

C = 1x27 categorical
     no  no  yes  yes  yes  no  no  no  no  undecided  undecided  yes  no  no  no  yes  no  yes  no  yes  no  no  no  yes  yes  yes  yes

Determine the number of elements that fall into each category.

[N,Categories] = histcounts(C)

N = 1×3
    11    14     2

Categories = 1x3 cell
    {'yes'}    {'no'}    {'undecided'}

Input Arguments

X — Data to distribute among bins
vector | matrix | multidimensional array

Data to distribute among bins, specified as a vector, matrix, or multidimensional array. If X is not a vector, then histcounts treats it as a single column vector, X(:).

histcounts ignores all NaN values. Similarly, histcounts ignores Inf and -Inf values unless the bin edges explicitly specify Inf or -Inf as a bin edge.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | datetime | duration
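
The handling of NaN and infinite values described above can be illustrated with a short sketch. The sample vector and the bin edges are made up for illustration.

```matlab
% Hypothetical sample data containing NaN and infinite values.
X = [1 2 3 NaN Inf -Inf];

% Finite edges only: NaN, Inf, and -Inf are all ignored.
% The bins are [0,2) and [2,4].
N1 = histcounts(X,[0 2 4]);

% Edges that explicitly include -Inf and Inf: the infinite values
% are now counted in the outermost bins. NaN is still ignored.
N2 = histcounts(X,[-Inf 0 2 4 Inf]);
```

Here N1 counts only the three finite values, while N2 contains two extra bins that pick up -Inf and Inf.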

C — Categorical data
categorical array

Categorical data, specified as a categorical array. histcounts ignores undefined categorical values.

Data Types: categorical

nbins — Number of bins
positive integer

Number of bins, specified as a positive integer. If you do not specify nbins, then histcounts automatically calculates how many bins to use based on the values in X.

Example: [N,edges] = histcounts(X,15) uses 15 bins.

edges — Bin edges
vector

Bin edges, specified as a vector. The first vector element specifies the leading edge of the first bin. The last element specifies the trailing edge of the last bin. The trailing edge is only included for the last bin.

For datetime and duration data, edges must be a datetime or duration vector in monotonically increasing order.
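
As a minimal sketch of the edge rule above (each bin includes its leading edge, and only the last bin also includes its trailing edge), consider the following made-up data and edges.

```matlab
% Hypothetical data with values that fall exactly on bin edges.
X = [1 2 2 3 3 3];

% Two bins: [1,2) and [2,3]. The value 3 lies on the trailing edge
% of the last bin, so it is counted there.
Na = histcounts(X,[1 2 3]);

% Three bins: [1,2), [2,3), and [3,4]. The value 3 now lies on the
% leading edge of the last bin.
Nb = histcounts(X,[1 2 3 4]);
```

In the first call, the 2s and 3s share the last bin. In the second call, adding the edge 4 separates them.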

Categories — Categories included in count
all categories (default) | string vector | cell vector of character vectors | pattern scalar | categorical vector

Categories included in count, specified as a string vector, cell vector of character vectors, pattern scalar, or categorical vector. By default, histcounts uses a bin for each category in categorical array C. Use Categories to specify a unique subset of the categories instead.

Example: h = histcounts(C,["Large","Small"]) counts only the categorical data in the categories Large and Small.

Example: h = histcounts(C,"Y" + wildcardPattern) counts categorical data in all the categories whose names begin with the letter Y.

Data Types: string | cell | pattern | categorical
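
The following sketch shows one way to count a subset of categories. The categorical array and its category names are hypothetical.

```matlab
% Hypothetical categorical data with three categories.
C = categorical(["Large","Small","Medium","Small","Large","Small"]);

% Count only the elements in the Large and Small categories.
% Elements in the Medium category are left out of the count.
[N,Categories] = histcounts(C,["Large","Small"]);
```

N has one element for each requested category, and Categories returns the matching names as a cell vector of character vectors.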

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: [N,edges] = histcounts(X,'Normalization','probability') normalizes the bin counts in N, such that sum(N) is 1.
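
The two ways of writing name-value arguments can be compared with a short sketch. The data is made up, and BinWidth is used only as a representative option.

```matlab
% Hypothetical sample data.
X = [2 3 5 7 11 13 17 19 23 29];

% Name=Value syntax (R2021a and later).
[N1,edges1] = histcounts(X,BinWidth=5);

% Comma-separated syntax with the name in quotes, which also works
% in earlier releases.
[N2,edges2] = histcounts(X,'BinWidth',5);

% Both calls are expected to produce the same bins.
sameResult = isequal(N1,N2) && isequal(edges1,edges2);
```

Each bin in edges1 is 5 units wide, and every element of X falls in one of the bins.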

Output Arguments

N — Bin counts
row vector

Bin counts, returned as a row vector.

edges — Bin edges
vector

Bin edges, returned as a vector. The first element is the leading edge of the first bin. The last element is the trailing edge of the last bin.

bin — Bin indices
array

Bin indices, returned as an array of the same size as X. Each element in bin describes which numbered bin contains the corresponding element in X.

A value of 0 in bin indicates an element which does not belong to any of the bins (for example, a NaN value).
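
A brief sketch of how zeros appear in bin, using made-up data in which one value lies outside the edges and one value is NaN:

```matlab
% Hypothetical data: 10 lies outside the edges, and NaN is ignored.
X = [0.5 1.5 2.5 10 NaN];

% Bins are [0,1), [1,2), and [2,3].
[N,~,bin] = histcounts(X,[0 1 2 3]);

% Elements that do not belong to any bin have index 0.
numExcluded = nnz(bin == 0);

% Only the binned elements contribute to the counts.
numBinned = nnz(bin > 0);
```

Because excluded elements are marked with 0, you can use bin > 0 as a logical mask to select only the data that was counted.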

Categories — Categories included in count
cell vector of character vectors

Categories included in count, returned as a cell vector of character vectors. Categories contains the categories in C that correspond to each count in N.

Tips

  • The behavior of histcounts is similar to that of the discretize function. Use histcounts to find the number of elements in each bin. On the other hand, use discretize to find which bin each element belongs to (without counting).
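
The difference between the two functions can be sketched with the same made-up data and edges passed to both:

```matlab
% Hypothetical data: 20 lies outside the bin edges.
X = [1 5 10 20];
edges = [0 4 8 12];

% histcounts returns one count per bin.
N = histcounts(X,edges);

% discretize returns one bin index per element. Elements outside
% the edges get NaN.
idx = discretize(X,edges);

% The third output of histcounts also gives per-element indices,
% but it marks out-of-range elements with 0 instead of NaN.
[~,~,bin] = histcounts(X,edges);
```

The two index outputs agree for elements that fall inside the edges and differ only in how they flag elements that do not.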

Version History

Introduced in R2014b

You can normalize histogram values as percentages by specifying the Normalization name-value argument as 'percentage'.
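
A minimal sketch of percentage normalization, assuming a release that supports the 'percentage' option. The data and edges are made up.

```matlab
% Hypothetical data and bin edges. Bins are [1,2), [2,3), and [3,4].
X = [1 1 2 3];
edges = [1 2 3 4];

% Raw counts for comparison.
Ncount = histcounts(X,edges);

% Counts expressed as percentages of the total.
Npct = histcounts(X,edges,'Normalization','percentage');

% The percentages are expected to add up to 100.
total = sum(Npct);
```

With half of the elements in the first bin, the first percentage is 50 and the remaining two bins hold 25 each.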

See Also

histogram | histogram2 | discretize | histcounts2 | kde

Topics

  • Replace Discouraged Instances of hist and histc

FAQs

What is the bin limit for histogram in MATLAB?

If you specify BinMethod for datetime or duration data, then histogram can use a maximum of 65,536 bins (or 2^16). For example, with the 'century' bin method, each bin is 1 century (100 calendar years).

What is the difference between histogram and histcounts in MATLAB?

Both histogram and histcounts have automatic binning and normalization capabilities, with several common options built-in. histcounts is the primary calculation function for histogram. The result is that the functions have consistent behavior.

What happens when you increase the number of bins in a histogram?

If you have too many bins, then the data distribution will look rough, and it will be difficult to discern the signal from the noise. On the other hand, with too few bins, the histogram will lack the details needed to discern any useful pattern from the data.

What effect does a histogram with too few bins have?

Too few bins can hide important patterns, and too many bins can make small but expected fluctuations in data appear important.

How to increase the number of bins in a MATLAB histogram?

N = morebins(h) increases the number of bins in histogram h by 10% (rounded up to the nearest integer) and returns the new number of bins. For bivariate histograms, this increases the bin count in both the x and y directions.

What is the rule of thumb for histogram bins?

Several rules of thumb exist for determining the number of bins, such as the belief that between 5 and 20 bins is usually adequate (for example, the legacy MATLAB hist function uses 10 bins by default, whereas histcounts chooses the number of bins automatically).

Why does bin size matter in histogram?

A 'bin' for each sample point gives us no more information, but only stretches the width of the chart. A good bin width will usually show a recognizable normal probability distribution curve, unless the data is really multi-modal. Then there could be two or more distinct 'humps' in the histogram chart.

What is Scott's rule for the number of bins?

Scott suggested using the Gaussian density as a reference standard, which leads to the data-based choice for the bin width of a × s × n^(-1/3), where a = 3.49, s is an estimate of the standard deviation, and n is the number of observations.
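
The formula above can be evaluated directly, as in the following sketch. The sample data is made up, and the variable names are chosen for illustration. The call to histcounts with the 'scott' bin method is shown only for comparison; its bin width may not equal the raw formula value, because histcounts can adjust the bin edges it returns.

```matlab
% Hypothetical sample data.
X = [2 3 5 7 11 13 17 19 23 29];

n = numel(X);   % number of observations
s = std(X);     % sample standard deviation

% Bin width from the formula a*s*n^(-1/3) with a = 3.49.
w = 3.49 * s * n^(-1/3);

% Built-in Scott binning, for comparison with the raw value w.
[N,edges] = histcounts(X,'BinMethod','scott');
builtinWidth = edges(2) - edges(1);
```

Comparing w with builtinWidth shows how closely the automatic bins follow the raw rule for a given data set.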

What are bin counts?

bincounts = histc(x,binranges) counts the number of values in x that are within each specified bin range. The input, binranges, determines the endpoints for each bin. The output, bincounts, contains the number of elements from x in each bin. Note that histc is a discouraged function; histcounts is the recommended replacement.

How to calculate bins in MATLAB?

nbins — Number of bins

Number of bins, specified as a positive integer. If you do not specify nbins, then histcounts automatically calculates how many bins to use based on the values in X. Example: [N,edges] = histcounts(X,15) uses 15 bins.

What is the transparency of a histogram in MATLAB?

Transparency of histogram bars, specified as a scalar value in range [0,1]. histogram uses the same transparency for all the bars of the histogram. A value of 1 means fully opaque and 0 means completely transparent (invisible). Example: histogram(X,'FaceAlpha',1) creates a histogram plot with fully opaque bars.

How do you choose the number of bins for a histogram?

Choose between 5 and 20 bins. The larger the data set, the more likely you'll want a large number of bins. For example, a set of 12 data points might warrant 5 bins, but a set of 1000 data points will probably be more useful with 20 bins. The exact number of bins is usually a judgment call.

What is the maximum number of bins in a histogram?

There is no hard maximum for the number of bins in a histogram. If the variable being plotted is continuous, then an argument can be made for an infinite number of categories (and the histogram basically becomes a rug plot). The number of points in the data set is not an appropriate upper bound.

What happens when you change the bin width of a histogram?

When we change the bins, the data gets grouped differently. The different grouping affects the appearance of the histogram.

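What is the range of a histogram bin?

A histogram counts the values that fall in each bin interval. For example, if you had values 0–100 and decided to divide the range into 5 bins, you could use the intervals [0, 20], (20, 40], (40, 60], (60, 80], (80, 100]. The range for bin 1 is [0, 20], for bin 2 is (20, 40], and so on. Note that histcounts uses the opposite convention: each bin includes its leading edge, and only the last bin also includes its trailing edge, giving [0, 20), [20, 40), [40, 60), [60, 80), [80, 100].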

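What is the default value of bins in histogram?

There is no universal default. Some tools create 10 bins by default, whereas histcounts automatically calculates how many bins to use based on the values in X when you do not specify nbins.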

Is there only one right number of bins for a histogram?

There is no single “optimal” number of bins, just an optimal number for communicating whatever it is that we need to say about the data.

What is the bin width of a histogram?

Under the Freedman-Diaconis rule, the bin width is set to h = 2 × IQR × n^(-1/3), where IQR is the interquartile range and n is the number of observations. The number of bins is then (max - min)/h, where max is the maximum value and min is the minimum value. If you use too few bins, the histogram doesn't really portray the data very well.

Article information

Author: Trent Wehner
