Cohesion represents the degree to which the elements of a module belong together. A module or class is said to be highly cohesive if its methods and data are highly related, meaning that a change in one affects just a small number of elements.
We can use this metric to know whether or not an object is in good shape or needs refactoring.
Typically, a class should have methods that are related to each other and share a common purpose, and that relationship should show up in the state they share. It goes hand in hand with the Single Responsibility Principle, the "S" in our favorite SOLID principles. (Here is my article about that if you need a refresher: "SOLID Principles in C#".)
So naturally, what we want is high cohesion. That means the elements of a module or component are closely related to each other, and they all contribute to achieving the module's or component's single, well-defined purpose. This can make the code more modular, easier to understand, and easier to maintain.
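Before turning to the negative case, here is a minimal sketch of what high cohesion can look like. The DateRange class and its members are hypothetical and only serve as an illustration: every method works on the same two fields, so the shared state itself expresses the common purpose.

```csharp
using System;

// Hypothetical example: all members operate on the same two fields.
public sealed class DateRange
{
    private readonly DateTime _start;
    private readonly DateTime _end;

    public DateRange(DateTime start, DateTime end)
    {
        if (end < start)
            throw new ArgumentException("End must not be before start.", nameof(end));

        _start = start;
        _end = end;
    }

    // Uses _start and _end.
    public TimeSpan Duration => _end - _start;

    // Uses _start and _end.
    public bool Contains(DateTime moment) => moment >= _start && moment <= _end;

    // Uses _start and _end of both ranges.
    public bool Overlaps(DateRange other) => _start <= other._end && other._start <= _end;
}
```

Removing either field would break every member, which is a good hint that the class has exactly one job.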
Let's have a look at an example with low cohesion:
class UserInfoService
{
    private readonly string _baseUrl;
    private readonly IUrlValidator _validator;
    private readonly HttpClient _client;

    public UserInfoService(string baseUrl, IUrlValidator validator, HttpClient client)
    {
        _baseUrl = baseUrl;
        _validator = validator;
        _client = client;
    }

    // Uses only _baseUrl and _validator.
    public bool ValidateUrl(string url)
    {
        return _validator.IsValid(_baseUrl, url);
    }

    // Uses only _client.
    public async Task<string> GetUserInfo(string userId)
    {
        var response = await _client.GetAsync($"https://api.example.com/users/{userId}");
        response.EnsureSuccessStatusCode();
        var content = await response.Content.ReadAsStringAsync();
        return content;
    }
}
Granted, it is quite small. The problem is that the class does two different things: validating a URL and getting something from an API endpoint. With cohesion in mind, we can see this very easily. We have three fields, _baseUrl, _validator and _client, which are used by two separate methods. There is a relation between _baseUrl and _validator, but neither of the two is related to _client. (Of course, this is just an example, and maybe you could use _baseUrl in GetUserInfo, but bear with me: we don't need that here.)
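The reasoning about which method touches which field can also be done mechanically. The following is a rough sketch, loosely in the spirit of LCOM4-style metrics, and not a real analyzer: the CohesionSketch class and its input format are assumptions made for illustration, and the field usage is typed in by hand instead of being read from the code. Constructors are left out on purpose because they usually touch every field.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class CohesionSketch
{
    // Groups methods that are connected through at least one shared field.
    // One group suggests a cohesive class; several groups hint at separate responsibilities.
    public static List<List<string>> GroupMethodsBySharedFields(
        IReadOnlyDictionary<string, string[]> fieldsUsedByMethod)
    {
        var groups = new List<(HashSet<string> Methods, HashSet<string> Fields)>();

        foreach (var entry in fieldsUsedByMethod)
        {
            // Find every existing group that shares a field with this method.
            var touching = groups.Where(g => g.Fields.Overlaps(entry.Value)).ToList();

            // Start a new group for the method and absorb all groups it touches.
            var merged = (
                Methods: new HashSet<string> { entry.Key },
                Fields: new HashSet<string>(entry.Value));

            foreach (var group in touching)
            {
                merged.Methods.UnionWith(group.Methods);
                merged.Fields.UnionWith(group.Fields);
                groups.Remove(group);
            }

            groups.Add(merged);
        }

        return groups.Select(g => g.Methods.OrderBy(m => m).ToList()).ToList();
    }
}
```

Fed with ValidateUrl using _baseUrl and _validator, and GetUserInfo using _client, the sketch returns two groups, one per method, which matches what we saw by eye.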
Neither method needs the other's fields to do its job, so they could stand alone. When you see fields that are not related to each other and are always used separately, chances are high that you can create two smaller objects out of the class. In our example it is obvious, as validating a URL and retrieving data from somewhere are two separate concerns. So we could have two classes like this:
class UrlValidator
{
    private readonly string _baseUrl;
    private readonly IUrlValidator _validator;

    public UrlValidator(string baseUrl, IUrlValidator validator)
    {
        _baseUrl = baseUrl;
        _validator = validator;
    }

    public bool ValidateUrl(string url)
    {
        return _validator.IsValid(_baseUrl, url);
    }
}

class UserInfoRetriever
{
    private readonly HttpClient _client;

    public UserInfoRetriever(HttpClient client)
    {
        _client = client;
    }

    public async Task<string> GetUserInfo(string userId)
    {
        var response = await _client.GetAsync($"https://api.example.com/users/{userId}");
        response.EnsureSuccessStatusCode();
        var content = await response.Content.ReadAsStringAsync();
        return content;
    }
}
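To show that the two classes really stand alone, here is a small usage sketch. Several parts are assumptions and not taken from the article: the shape of IUrlValidator is guessed from the IsValid(baseUrl, url) call, PrefixUrlValidator is a made-up implementation, and StubHandler fakes the HTTP response so the example runs without network access.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Assumed shape of the interface; the article does not show its definition.
public interface IUrlValidator
{
    bool IsValid(string baseUrl, string url);
}

// Hypothetical validator: a URL counts as valid if it starts with the base URL.
public sealed class PrefixUrlValidator : IUrlValidator
{
    public bool IsValid(string baseUrl, string url) =>
        url.StartsWith(baseUrl, StringComparison.OrdinalIgnoreCase);
}

// Hypothetical stub that answers every request with a fixed JSON body.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("{\"id\":\"42\"}")
        };
        return Task.FromResult(response);
    }
}

public sealed class UrlValidator
{
    private readonly string _baseUrl;
    private readonly IUrlValidator _validator;

    public UrlValidator(string baseUrl, IUrlValidator validator)
    {
        _baseUrl = baseUrl;
        _validator = validator;
    }

    public bool ValidateUrl(string url) => _validator.IsValid(_baseUrl, url);
}

public sealed class UserInfoRetriever
{
    private readonly HttpClient _client;

    public UserInfoRetriever(HttpClient client) => _client = client;

    public async Task<string> GetUserInfo(string userId)
    {
        var response = await _client.GetAsync($"https://api.example.com/users/{userId}");
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

public static class Demo
{
    // Each object is created and used without any knowledge of the other.
    public static async Task<(bool IsValid, string UserInfo)> RunAsync()
    {
        var validator = new UrlValidator("https://api.example.com", new PrefixUrlValidator());
        var retriever = new UserInfoRetriever(new HttpClient(new StubHandler()));

        var isValid = validator.ValidateUrl("https://api.example.com/users/42");
        var userInfo = await retriever.GetUserInfo("42");
        return (isValid, userInfo);
    }
}
```

Note that the validator can now be tested without an HttpClient, and the retriever without a validator, which was not possible with the combined class.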
Conclusion
Cohesion can help us see how well the parts of our objects work together to achieve a single, well-defined goal. If you notice that the fields and methods of a class are largely unrelated, you might want to refactor it into smaller objects.