Having, Doing, and Being

첫 도전에 한 문제 풀었다.. 다음에는 두 문제 풀어야지. 각 문제 마다도 Test Set 이 쉬운 것과 어려운 것이 주어지는데, 다음에는 어려운 것만 풀기보다 일단 쉬운 것도 풀어봐야 겠다.

Allocation (5pts, 7pts)

그리디라서 신나게 풀었다. 하지만 해설은 넘 어렵

6’20”

#include <iostream>
using namespace std;
#include <algorithm>
int main() {
	int T;
	cin >> T;
	for (int t = 1; t <= T; t++) {
		int n, b;
		cin >> n >> b;
		int* arr = new int[n]();
		for (int i = 0; i < n; i++) {
			cin >> arr[i];
		}
		sort(arr, arr + n);

		int ans = 0;
		for (int i = 0; i < n; i++) {
			if (arr[i] <= b) {
				b -= arr[i];
				ans++;
			}
		}

		printf("Case #%d: %d\n", t, ans);
	}
}

Analysis

최대한 많은 수의 집을 사는 것이므로, 집 가격이 저렴한 순으로 정렬을 해서 구매한다.

시간 복잡도

정렬하는 부분은 보통 O(nlogn) 의 복잡도를 가지고, 정렬 이후의 부분은 O(N) 이다. counting sort 를 쓰면 복잡도를 O(N)까지 낮출 수 있다고 한다. (배열 요소의 최대값이 1000 이기 때문)

Counting Sort, 계수 정렬: 각 숫자의 개수를 세고, 새로운 배열에 저장한다. 그리고 그 누적합이 기존 배열에서의 숫자의 위치가 된다. 그러나 메모리 낭비가 심하다.

그리디 알고리즘

그리디 알고리즘을 쓴 결과가 A = {a1, a2, … , ak} 이고, 정답이 O = {o1, o2, … ,om} 일때 A 와 O 가 같으면 그리디 알고리즘을 쓰는 것이 옳음을 증명할 수 있다.

O 에는 있지만 A 에는 없는 최소 하나의 oj 를 생각해보자.

우리는 항상 기존 배열에서 최솟값만 제거하기 때문에, A 에 없는 어떤 원소는 A 에 있는 원소 ai 보다 크거나 같은 것을 알 수있다.

We could replace oj with the absent element in A without worsening the solution, because there will always be element in A that is not in O. We then increased number of elements in common between A and O, hence we can repeat this operation only finite number of times. We could repeat this process until all the elements in O are elements in A. Therefore, A is as good as any optimal solution. ?

Plates (9ts, 15pts)

DP 인걸 아는데.. 왜 풀질 못하니.. 괜히 제대로 쓰지도 못하는데 탑다운 함수 만들어서 풀다가.. 다 헷갈렸다. 사실 바텀업 푸는 것도 잊어버렸지만.

#include <iostream>
using namespace std;
#include <algorithm>
int main() {
	int T;
	cin >> T;
	for (int t = 1; t <= T; t++) {
		int n, k, p; cin >> n >> k >> p;
		int sum[51][31];
		for (int i = 1; i <= n; i++) {
			sum[i][0] = 0;
			for (int j = 1; j <= k; j++) {
				cin >> sum[i][j];
				sum[i][j] += sum[i][j-1];
			}
		}

		int dp[51][1501];
		memset(dp, 0, sizeof(dp));
		for (int i = 1; i <= n; i++)
			for (int j = 0; j <= p; j++)
				for (int x = 0; x <= min(j, k); x++)
					dp[i][j] =
					max(dp[i][j], sum[i][x] + dp[i - 1][j - x]);

		printf("Case #%d: %d\n", t, dp[n][p]);
	}
}

Analysis

dp[i][j] 를 번호 i 까지의 스택들을 사용해서 총 j 개의 접시들을 고르는 최대합이라고 생각하자. 그러면 우리가 구하고자 하는 답은 dp[N][P] 이다.

dp 를 효율적으로 계산하기 위해서는, ‘스택 i 에서 x 개의 접시를 선택할 때의 합’ 을 효율적으로 구해야한다. 따라서 sum[i][x] 를 스택 i 에서 x 개의 접시를 선택할 때의 합이라고 하고, N 개의 스택에 대해 미리 계산한다.

다음으로는 스택을 돌면서 ‘i 번째 까지의 스택들을 사용해서 j 개의 접시를 사용하는 최대 합은 얼마인지’ 판단해야한다. 그리고 ‘j 개의 접시들 중 i 번째 스택에서 온 접시가 몇 개인지’ 판단해야한다.

i 까지의 스택에서 x 개의 접시를 선택할 때, dp[i][j] = max(dp[i][j], sum[i][x] + dp[i-1][j-x]) 가 된다.

따라서 i 까지의 스택에서 j 개의 접시를 선택하려면 i 번째 스택에서 [0,1,…, j] 개의 접시를 선택하고, i-1 번째 스택에서 [j,j-1,…, 0] 개를 선택해야한다. 또한, 1<=j<=P 인 j 에 대해서 실행해야한다.

for i [1, N]: for j [0, P]: dp[i][j] := 0 for x [0, min(j, K)]: dp[i][j] = max(dp[i][j], sum[i][x]+dp[i-1][j-x])

시간복잡도는 O(NPK) 가 된다.

0-1 Knapsack Problem, 배낭문제

: 배낭 문제(Knapsack Problem)는 조합 최적화의 유명한 문제이다. 간단하게 말하면, 한 여행가가 가지고 가는 배낭에 담을 수 있는 무게의 최댓값이 정해져 있고, 일정 가치와 무게가 있는 짐들을 배낭에 넣을 때, 가치의 합이 최대가 되도록 짐을 고르는 방법을 찾는 문제이다. 이 배낭문제는 짐을 쪼갤 수 있는 경우(무게가 소수일 수 있는 경우)와 짐을 쪼갤 수 없는 경우(이 경우 짐의 무게는 0 이상의 정수만 가능) 두 가지로 나눌 수 있는데, 짐을 쪼갤 수 있는 경우의 배낭문제를 분할가능 배낭문제(Fractional Knapsack Problem), 짐을 쪼갤 수 없는 경우의 배낭문제를 0-1 배낭문제(0-1 Knapsack Problem)라 부른다. 이 문제는 쪼갤 수 있는 경우에는 그리디 알고리즘으로 다항 시간에, 쪼갤 수 없는 경우에는 동적계획법(Dynamic Programming)등으로 의사 다항 시간에 풀 수 있다. 단, 쪼갤 수 없는 경우는 NP-완전이기 때문에 알려진 다항 시간 알고리즘은 없고, FPTAS만 존재한다. 배낭 문제에 대한 FPTAS는 오스카 이바라와 김철언이 1975년에 개발하였다.

https://ko.wikipedia.org/wiki/배낭_문제

Workout (11pts, 18pts)

와.. 어쩐지 이분 탐색인 것 같았는데. 어렴풋이 스킬은 알았지만, 이제 구체화해서 변형 문제도 풀 수 있도록 해야겠다.

Analysis

n 개의 숫자에서 연속된 두 숫자의 최대 거리가 최소가 되도록 k 개의 숫자를 추가한다. 최대 차이의 최솟값을 출력한다.

테스트셋 1에 대해서는 k 가 1이기 때문에, 기존 배열에서의 최대 거리를 반으로 나누는 것이 정답이다. 예를 들어 [ 2,12,18 ] 이 있으면 [ 2, 7, 12, 18] 로 7 을 추가해, 결국 최대 거리는 18 - 12 = 6이 된다. 시간 복잡도는 기존 배열에서 거리 차이를 구하고, 최대값을 찾고, 다시 비교해야하기 때문에 O(N) 이다.

테스트셋 2에 대해서는 k 가 여러개이기 때문에 최대 거리의 절반에 숫자를 추가하는 것이 최적의 정답이 아니다. [ 2, 12 ] 가 있을 때, 먼저 [ 2, 7, 12 ] 로 나누고 그 다음 [ 2, 7, 9, 12] 로 나누면 답은 5가 된다. 그러나 최적의 정답은 [ 2, 5, 8, 12 ]로 답은 4이다.

i 번째 두 정수의 차이를 di 라고 하자. 어떤 수를 이 사이에 추가해서 수열의 최대 차이가 특정 값(doptimal 이라고 하자)을 넘지 않을 때, 이 사이에 몇 개의 수를 추가할 수 있을까? 정답은 (di / doptimal) - 1 개를 올림한 값이다. 이 개수를 k’i 라고 하자. 모든 N-1 개의 차이에 대해서 k’[1, …, N-1]이 있을 것이다. 또한 이것의 총합인 k’sum = k’1 + k’2 + … + k’N+1 이고, k’sum <= K 이어야 한다.

그렇다면 이제 정답인 doptimal 을 구해야한다. doptimal 은 [1, max(di)] (1 ≤ i ≤ N-1) 사이에 있을 것이다. 따라서 선형탐색으로 1 부터 확인해보면서, 위의 조건을 만족하는 값을 바로 출력하면된다. 더 빠르게 하려면 이분 탐색을 쓰면 된다. 참고로, ki = (di / doptimal) - 1 에서 doptimal 이 커지면 위의 값은 작아지고, 따라서 k’sum 의 값도 작아 지는 것을 알 수 있다. 따라서, [1, max(di)] 에서 이분 탐색을 쓰면서

On closer observation, you can see that increasing the value of doptimal decreases the value of ceiling(di / doptimal)-1 and hence smaller is the value of k’sum. Therefore, we can perform a binary search in the range [1, max(di)] to find the least value of doptimal that makes k’sum ≤ K. That is our answer. Since the max(di) could be as much as 109, we might have to search [1, 109] making time complexity of the solution is O(log(109)* N).

3 5 2 10 13 15 16 17 5 6 9 10 20 26 30 8 3 1 2 3 4 5 6 7 10

Case #1: 2 Case #2: 3 Case #3: 1

Bundling (14pts, 21pts)

Analysis

We need to maximise the sum of scores of each bundle. Let us consider a bundle and say longest prefix shared by all strings of this bundle is of length X. Now each prefix of length from 1 to X is shared by all of the strings in this bundle. Consider any prefix among these X prefixes, it is counted once in the score of this bundle. Thus the score of a bundle can be defined as number of prefixes that are shared by all of the strings in this bundle. Thus if a prefix is shared by all strings in Y bundles, then it will contribute Y to the total sum of scores. Now instead of finding the total score, we find the contribution of each prefix in the total score. So for maximising the total score, we would want to maximize the contribution of each prefix in the total score. Let the contribution of each prefix PRE be contibution(PRE). We want to maximize ∑ contribution(PRE) where PRE comprises all possible prefixes of the given strings. Let us say a prefix Pi is a prefix of S strings. The maximum number of bundles of size K formed by S strings is ⌊ S / K ⌋. In each of these ⌊ S / K ⌋ bundles, prefix Pi will add to their scores. Thus maximum value of contribution(Pi) can be ⌊ S / K ⌋. So a prefix Pi which occurs as a prefix of S strings will contribute ⌊ S / K ⌋ to the answer. Let us see if we can achieve this maximum value for all the prefixes. Consider a prefix P of length L. It occurs as a prefix in CNT number of strings. Now consider there are C prefixes of length L + 1 which contain the prefix P as a prefix (P1, P2, ….,PC). And we have stored the number of strings these prefixes are part of as (CNT1, CNT2, ….,CNTC). Let us say we divided the strings which have prefix Pi into ⌊ (CNTi / K) ⌋ budles. Now we have CNTi%K strings remaining for each prefix that we need to arrange so that they make a bundle. For each of these remaining strings we cannot have a bundle of size K which would have a common prefix of length L + 1 because we have CNTi%K remaining strings for each Pi. So, we can make bundles in any order using the remanining strings. Those bundles will still have a prefix of length L shared among them. Thus we would be left with CNT%K number of strings which are not in any bundle when we consider prefix P. We can continue this procedure till we are left with prefix of length 0. We would be able to bundle all the strings at this point because we would have N % K strings remaining, and as specified in the problem, N is divisible by K. The problem is now reduced to finding number of times each prefix occurs in the given strings. Let this number be COUNT. We just need to add ⌊ COUNT / K ⌋ to the answer for each prefix. Test set 1 The length of each string is at most 5. Thus we have total number of prefixes as N * 5 and each prefix can be of size at most 5. Store each prefix in a hashmap and increase the count for each prefix. In the end, we just need to add ⌊ (count(i) / K) ⌋ for each prefix i. The complexity of the solution would be O(N * 5 * 5). Test set 2 Let the sum of length of all strings over all the test cases be SUM which is 106. For the large test set, the length of the string can be very large. So, we can’t store all the prefixes in a hashmap. We need to store all the prefixes in an efficient manner along with the number of times they occur in given strings. We can use a data structure trie. The insertion cost would be equal to sum of length of strings over the test cases which is O(SUM). Then finally we just need to traverse the trie and for each prefix, add its contribution to the answer. Time complexity of the solution would be O(SUM).

1 6 3 RAINBOW FIREBALL RANK RANDOM FIREWALL FIREFIGHTER

Case #1: 6